Abstract

We consider the detection of binary (antipodal) signals transmitted in a spatially multiplexed fashion
over a fading multiple-input multiple-output (MIMO) channel and where the detection is done by means
of semidefinite relaxation (SDR). The SDR detector is an attractive alternative to maximum likelihood
(ML) detection since the complexity is polynomial rather than exponential. Assuming that the channel
matrix is drawn with i.i.d. real valued Gaussian entries, we study the receiver diversity and prove that
the SDR detector achieves the maximum possible diversity. Thus, the error probability of the receiver
tends to zero at the same rate as the optimal maximum likelihood (ML) receiver in the high signal to
noise ratio (SNR) limit. This significantly strengthens previous performance guarantees available for the
semidefinite relaxation detector. Additionally, it proves that full diversity detection is in certain scenarios
also possible when using a non-combinatorial receiver structure.

Index Terms

Semidefinite relaxation, diversity, MIMO, detection.



SUBMITTED TO THE IEEE TRANSACTIONS ON INFORMATION THEORY 




I. Introduction 

Herein, we consider the detection of binary symbols transmitted over an n by m multiple-input multiple- 
output (MIMO) channel modelled according to 

y = Hs + v (1) 

where $s \in \mathbb{B}^m = \{\pm 1\}^m$, $H \in \mathbb{R}^{n \times m}$ and $v, y \in \mathbb{R}^n$. In what follows, y is referred to as the vector of
received signals; H as the channel matrix; s as the transmitted message; and v as the additive noise based
on their physical interpretations in the digital communications context. The additive noise is assumed to
be white and Gaussian with a variance of $\rho^{-1}$ per component. It will also be assumed that the channel
matrix, H, is known to the receiver and that all possible transmitted messages, s, are equally likely. 

The problem of detecting a vector of symbols (not necessarily binary) transmitted over a MIMO 
channel is of general interest as it arises frequently in digital communications. Examples include, but 
are not limited to, the multiuser detection problem in CDMA [1] and communications over a multiple 
antenna channel [2]. However, while the detection problem is the same for many areas, the structure and 
assumptions regarding the channel matrix, H, will typically differ depending on the specific context. In the 
interest of simplicity, we will assume that the channel matrix may be modelled using i.i.d. Gaussian entries 
with zero mean and finite variance, an assumption motivated by the problem of wireless communication 
over a richly scattered fading multiple antenna channel [2]. The signal to noise ratio (SNR) of the channel 
is equal to $\rho$ and we will focus on an analysis of the high SNR regime.

The maximum likelihood (ML) estimate of s, $\hat{s}_{ML}$, is well known to be given by

$$ \hat{s}_{ML} = \arg\min_{s \in \mathbb{B}^m} \|y - Hs\|^2 \tag{2} $$

where $\|\cdot\|$ denotes the Euclidean norm, i.e. the ML detector, or receiver, selects the message, s, which
minimizes the distance between the received signals and the hypothesized noise-free message, Hs. An
error is declared whenever $\hat{s}_{ML} \neq s$ and it is well known that the ML detector is optimal in the sense that
it minimizes the probability of error. However, for a general channel matrix, H, and vector of received
signals, y, the ML detection problem in (2) has been shown to be NP-hard [3] and the full search solution
has a complexity of $O(2^m)$ where m is the number of symbols jointly detected. A similar result holds for
the sphere decoding algorithm which is able to provide exact solutions to (2) at an expected complexity
on the order of $O(2^{\gamma m})$ for some $\gamma \in (0, 1]$ [4]. The complexity is thus, although significantly lower
than the full search, still exponential.
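
To make the exhaustive search in (2) concrete, the following Python sketch enumerates all $2^m$ hypotheses for the model (1). It is purely illustrative: the function name `ml_detect`, the dimensions, the seed and the SNR value are assumptions made for this example and are not taken from the original work.

```python
import itertools
import numpy as np

def ml_detect(y, H):
    """Exhaustive ML detection: minimise ||y - H s||^2 over s in {-1,+1}^m."""
    m = H.shape[1]
    best_s, best_cost = None, np.inf
    # Full search over all 2^m binary hypotheses.
    for candidate in itertools.product((-1.0, 1.0), repeat=m):
        s = np.array(candidate)
        cost = np.sum((y - H @ s) ** 2)
        if cost < best_cost:
            best_s, best_cost = s, cost
    return best_s

rng = np.random.default_rng(0)            # arbitrary seed (assumption)
n, m, rho = 6, 4, 100.0                   # illustrative sizes and SNR (assumptions)
H = rng.standard_normal((n, m))           # i.i.d. real valued Gaussian channel
s = rng.choice([-1.0, 1.0], size=m)       # transmitted binary message
v = rng.standard_normal(n) / np.sqrt(rho) # noise with variance 1/rho per component
y = H @ s + v                             # received vector, model (1)
s_ml = ml_detect(y, H)
```

The loop visits every hypothesis once, which is exactly why the cost grows as $O(2^m)$ and why the polynomial-time alternatives discussed next are of interest.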

Thus, the use of suboptimal (but computationally advantageous) alternatives to ML detection is
motivated. However, when applied to a fading channel there is unfortunately often a significant loss
in performance associated with many of the suboptimal alternatives. This is illustrated in Fig. 1 where
the probability of error for three different detectors is shown for a channel matrix H of fixed dimensions. By comparing
the ML detector and minimum mean square error (MMSE) detector [2] it can be seen that not only is the
MMSE suboptimal, but the rate at which the probability of error tends to zero with increasing SNR is
significantly lower than that of the optimal ML detector. This in turn results in a large loss in performance
in the high SNR regime. The rate at which the error probability vanishes, or more precisely the slope
(in log-log scale) of the error probability curve in the high SNR regime, is commonly referred to as
the diversity of the detector and it is well known that the MMSE detector has a significantly lower
diversity than the ML detector [2]. However, the third curve in Fig. 1 shows the probability of error for
a receiver structure known as the semidefinite relaxation (SDR) detector or receiver. The SDR detector
was (in the communications literature) first proposed in [5], [6], [7] for CDMA multiuser detection but
is applicable for the detection of binary signals transmitted over any MIMO channel of the form (1).
The SDR receiver is based on a convex relaxation technique where the optimization in (2) is simplified
by first expanding the feasible set and then applying a rounding procedure to obtain an approximate
solution to (2). Note that this statement is also true for the zero forcing (ZF) and MMSE receivers
where an unconstrained least squares problem (a regularized least squares problem in the MMSE case) is
initially solved and where the symbol estimates are then obtained by componentwise threshold decisions.
However, the semidefinite relaxation differs from ZF and MMSE receivers in that the problem is first
lifted into a higher dimensional space before the relaxation takes place. From Fig. 1 it is apparent that
the SDR receiver, although suboptimal in the sense that it does not achieve the minimum probability of
error, does not suffer the loss in diversity experienced by the MMSE receiver.
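
The ZF and MMSE receivers described above (least squares, respectively regularized least squares, followed by componentwise threshold decisions) can be sketched as follows. The function names, the toy channel, and the convention that a zero decision statistic is mapped to +1 are assumptions made for illustration; the regularization weight $\rho^{-1}$ corresponds to the noise variance of the model (1) with unit energy symbols.

```python
import numpy as np

def zf_detect(y, H):
    """Zero forcing: unconstrained least squares, then sign decisions."""
    s_ls, *_ = np.linalg.lstsq(H, y, rcond=None)
    return np.where(s_ls >= 0, 1.0, -1.0)   # ties mapped to +1 (assumption)

def mmse_detect(y, H, rho):
    """MMSE: least squares regularised by the noise variance 1/rho, then signs."""
    m = H.shape[1]
    s_reg = np.linalg.solve(H.T @ H + np.eye(m) / rho, H.T @ y)
    return np.where(s_reg >= 0, 1.0, -1.0)  # ties mapped to +1 (assumption)

# Toy 6x3 channel with orthogonal columns (hypothetical example data).
H = np.vstack([np.eye(3), np.eye(3)])
s = np.array([1.0, -1.0, 1.0])
y = H @ s                                   # noise-free received vector
s_zf = zf_detect(y, H)
s_mmse = mmse_detect(y, H, rho=10.0)
```

Both detectors relax the binary constraint completely before thresholding, which is the step in which the diversity loss relative to ML detection arises.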

The main contribution of this work is the analytic proof of the observation above. Namely, if the
entries of $H \in \mathbb{R}^{n \times m}$ are i.i.d. zero mean Gaussian with a finite variance and n > m, then the SDR
receiver achieves the maximum possible receiver diversity. The result is formally stated in Theorem 1 in
Section II-B and represents a non-trivial extension of previously known performance guarantees available
for the SDR detector, see e.g. [8], [6], [9].

The topic of receiver diversity has received significant attention in the digital communications literature 
and other low complexity receivers have been designed specifically with diversity in mind. Perhaps, most 
prominent among these receivers are the lattice-reduction-aided (LRA) receivers [10], [11]. In the LRA 
receiver one performs a change of basis under which the conditioning of H is improved and then applies 
a simple (e.g. ZF, MMSE or decision feedback) detector in the new basis. It has also recently been 
shown that it is possible to construct (low complexity) full diversity receivers based on these ideas [12],
again under the assumption that n > m. However, the design philosophies underlying the LRA and SDR
detectors are fundamentally different. Whereas the LRA receiver is combinatorial in nature, the SDR detector is
based on the minimization of a continuous function over a convex set. Further, in the LRA receiver it is
assumed that the transmitted message belongs to an (infinite) integer lattice which enables the change of
basis while in the SDR approach explicit use is made of the binary symbol assumption.

As previously stated, we treat the SDR receiver under the assumption that the channel matrix is 
i.i.d. Gaussian and real valued. The main reason for this is that the SDR receiver is most easily treated in 
the real valued case. It should however be mentioned that the extension to the complex case is non-trivial
and that numerical results suggest that a theorem analogous to Theorem 1 may not hold in this case.
However, the numerical results also indicate that the loss in diversity (with respect to the ML detector)
remains small. We discuss this issue further in Section VI-B. Additionally, the underdetermined (n < m)
case is treated in Section VI-A. In the latter case our proof of Theorem 1 provides a lower bound on the
diversity achieved by the SDR receiver which shows that if m - n is not too large, then the diversity of
the SDR is strictly larger than that of the MMSE and ZF receivers.

In Section II we review the SDR receiver and present the main contribution of this work, namely
Theorem 1. In Section III a short outline of the proof is given and the rigorous analysis is given in
Section IV and Section V. Further, a short discussion of how the results may possibly be generalized to
other scenarios is given in Section VI. Also, although it makes no difference for the analytical results,
we will in the numerical examples normalize the channel matrix, H, such that each component has a
variance of $n^{-1}$, yielding unit energy symbols at the receiver.

II. Semidefinite Relaxation 

The use of semidefinite relaxation for bounding the optimal value of a combinatorial optimization 
problem was first considered in the late seventies [13] (where it was used to bound the Shannon capacity 
of a graph). Theoretical work in the nineties [14] along with the introduction of practical methods for 
solving semidefinite programs [15], [16], [17] made the semidefinite relaxation a viable method for finding 
approximate solutions to many combinatorial problems. A famous example where the SDR technique 
can be applied is the max cut problem in graph theory [18]. The application of SDR to the detection 
problem considered herein has also been studied in the communications literature [5], [6], [7]. 

We will in Section II-A provide a short review of the SDR detector in the communications context. It
is not the intention to give a complete treatment of the SDR detector in terms of implementation or to
discuss the various improvements which have been proposed but rather to introduce notation and capture
specific assumptions made herein. The reader is instead referred to the original works [5], [6], [7] for a
thorough treatment of the SDR detector in the context of digital communications. See also, apart from
the above, [19] for a comprehensive collection of results regarding semidefinite programming in general
and also specific results regarding the semidefinite relaxation technique.

A. The SDR Detector 

In order to introduce the semidefinite relaxation technique it is useful to note that the (non-convex) 
optimization problem given by 

$$ \min_{X,\,x} \ \mathrm{Tr}(LX) \quad \text{s.t.} \quad \mathrm{diag}(X) = e, \quad X = xx^T \tag{3} $$

where e is the vector of all ones and where

$$ L \triangleq \begin{bmatrix} H^T H & -H^T y \\ -y^T H & y^T y \end{bmatrix}, \qquad x \triangleq \begin{bmatrix} s \\ 1 \end{bmatrix} \tag{4} $$

is equivalent to (2) in the sense that the solution to (2) is easily obtained from the solution to (3) and vice
versa [5], [6], [19]. Essentially, the formulation of (3) is obtained by lifting (2) into a higher dimension
where the criterion is linear in the optimization variable. The rank one constraint on X along with the
diagonal constraint ensure there is a one to one correspondence between the feasible sets of (2) and (3).
The optimal point of (3) is related to the optimal point of (2) through x as shown in (4).
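
The equivalence between the lifted criterion Tr(LX) in (3) and the ML criterion in (2) can be checked numerically for a rank one X built from a binary s. The sketch below is only a sanity check under assumed dimensions and random data; the helper name `lifted_cost_matrix` is hypothetical.

```python
import numpy as np

def lifted_cost_matrix(H, y):
    """Build L = [[H^T H, -H^T y], [-y^T H, y^T y]] as in (4)."""
    top = np.hstack([H.T @ H, -(H.T @ y)[:, None]])
    bottom = np.hstack([-(y @ H), [y @ y]])
    return np.vstack([top, bottom])

rng = np.random.default_rng(1)     # arbitrary seed (assumption)
n, m = 5, 3                        # illustrative dimensions (assumption)
H = rng.standard_normal((n, m))
y = rng.standard_normal(n)
s = np.array([1.0, -1.0, -1.0])    # some binary hypothesis

x = np.append(s, 1.0)              # x = [s; 1]
X = np.outer(x, x)                 # rank one, with diag(X) = e
L = lifted_cost_matrix(H, y)

lifted = np.trace(L @ X)           # criterion of (3)
direct = np.sum((y - H @ s) ** 2)  # criterion of (2)
```

Here `lifted` and `direct` agree up to rounding, since $\mathrm{Tr}(Lxx^T) = x^T L x = \|y - Hs\|^2$.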

As (2) and (3) are equivalent they are also equally hard to solve from a complexity theoretic point of
view. In particular, it follows from [3] that (3) is also NP-hard in general. However, consider now instead
the optimization problem given by

$$ \min_{X} \ \mathrm{Tr}(LX) \quad \text{s.t.} \quad \mathrm{diag}(X) = e, \quad X \succeq 0 \tag{5} $$

where $X \succeq 0$ means that X is symmetric and positive semidefinite. Since $X = xx^T$ implies $X \succeq 0$ it follows
that (5) represents a relaxation of (3). The problem in (5) is referred to as the semidefinite relaxation
of (3) (or equivalently (2)) and serves as the basis for the semidefinite relaxation detector.

It is useful to note that (5) is a convex problem which can be efficiently solved in polynomial time [16],
[20]. In particular, there is an interior point algorithm which solves (5) to any fixed precision in $O(m^{3.5})$
time [21], see also [5] where this algorithm is presented in the digital communications context. In practice,
only a few iterations with a complexity comparable to that of inverting an m by m matrix are required
in order to obtain an approximate solution to (5).

It is straightforward to see that when the optimal solution to (5) is rank one it is also an optimal
solution to (3). The existence of rank one solutions to (5) is however by no means guaranteed and in
general, the solution to (5) can only serve as a basis for obtaining an approximate solution to (3). In fact,
it is possible to characterize exactly (in terms of H, s and v) when (5) will and will not have rank one
solutions, see [22] for necessary and sufficient conditions.

When the optimal point of (5) is not rank one, some type of rounding procedure has to be used to
round the optimal point of (5) to a point in the feasible set of (3). There are several suggestions for this
in the literature. Among the more powerful approaches are a randomization technique [18], [6] and an
approximation using the dominant eigenvector [5]. Numerical evidence suggests that the randomization
technique results in superior error performance. We shall however consider the very simple strategy of
using the signs of the last column of $X^*$ where $X^*$ is an optimal point of (5). This approach
was also mentioned in [5] but discarded in favor of the (superior) eigenvector approach. However, as the
simpler approach already achieves the maximum diversity we shall only consider this approach in detail.
It should however be noted that the proof extends to the dominant eigenvector case in a straightforward
manner by simply appealing to results regarding the continuity of eigenvectors corresponding to distinct
(multiplicity one) eigenvalues.
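
The two stages of the SDR receiver, solving the relaxation and rounding by the signs of the last column, can be sketched as below. Note the assumptions: instead of the interior point method of [21], the sketch substitutes a simple coordinate descent over a factorization $X = V^T V$ with unit-norm columns, which keeps diag(X) = e and $X \succeq 0$ by construction but is only a heuristic for which no convergence guarantee is claimed here. The function name, the factor dimension, the number of sweeps and the toy channel are likewise hypothetical choices.

```python
import numpy as np

def sdr_detect(y, H, sweeps=200, seed=0):
    """Heuristic sketch: approximately minimise Tr(L X) over X = V^T V with
    unit-norm columns of V, then round by the signs of the last column of X."""
    m = H.shape[1]
    L = np.block([[H.T @ H, -(H.T @ y)[:, None]],
                  [-(y @ H)[None, :], np.array([[y @ y]])]])
    k = m + 1                                    # factor dimension (assumed choice)
    rng = np.random.default_rng(seed)
    V = rng.standard_normal((k, m + 1))
    V /= np.linalg.norm(V, axis=0)               # enforces diag(V^T V) = e
    for _ in range(sweeps):
        for i in range(m + 1):
            # g = sum over j != i of L_ij v_j; the terms of Tr(L V^T V)
            # involving v_i are L_ii + 2 v_i . g when ||v_i|| = 1.
            g = V @ L[:, i] - L[i, i] * V[:, i]
            norm = np.linalg.norm(g)
            if norm > 1e-12:
                V[:, i] = -g / norm              # exact minimiser over unit v_i
    X = V.T @ V
    s_hat = np.where(X[:m, m] >= 0, 1.0, -1.0)   # signs of the last column
    return s_hat, X, L

# Toy noise-free example with orthogonal channel columns (hypothetical data).
H = np.vstack([np.eye(3), np.eye(3)])
s = np.array([1.0, -1.0, 1.0])
s_hat, X, L = sdr_detect(H @ s, H)
```

In this noise-free toy case the iterate reaches the rank one point $xx^T$ with $x = [s; 1]$, so the rounding returns s; in general the returned X need not be rank one, which is precisely the situation the rounding step is meant to handle.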

To summarize, we obtain the SDR estimate, $\hat{s}_{SDR}$, as follows. Let $X^*$ be the minimizer of (5). Then
$\hat{s}_{SDR}$ is defined according to

$$ [\hat{s}_{SDR}]_i = \mathrm{sgn}([X^*]_{i,m+1}), \qquad i = 1, \ldots, m \tag{6} $$

where $\mathrm{sgn}(x)$ is the sign function, i.e. $\hat{s}_{SDR}$ is given by the signs of the last column of $X^*$. Note that although it is
possible for (5) to have several optimal solutions it is always possible to pick some unique optimizer,
$X^*$, from the optimal set. Thus, it can be assumed that $\hat{s}_{SDR}$ is uniquely determined by y and H.

Finally, it should be mentioned that extensions to the original semidefinite relaxation detectors have
appeared in the literature. These include for example extensions to M-PSK constellations [23] and M-QAM
constellations [24]. However, the analysis of these extensions is not treated herein.

B. SDR Performance

The extraordinary performance of the SDR technique in many areas has been a motivating reason
for its study and there are results in the literature regarding the quality of the semidefinite relaxation
approximation of (3) for more or less arbitrary choices of the matrix L (in (3)). These include the bound
in [8] which is a generalization of a previous result for the max cut problem [18]. There are also some
results relating the semidefinite relaxation to other relaxations available for binary quadratic programs
(such as (2)) [25].

In the context of digital communications it has previously been shown that several low complexity 
detectors may be viewed as further relaxations of the SDR detector [6]. Notably, these low complexity 
detectors include both the ZF and MMSE detectors and give strong support for the SDR approach although 
the results in [6] relate to the objective values of the relaxations rather than directly to the quality of 
the estimates, $\hat{s}$. Further, a probabilistic bound on the difference in optimal objective value between (3)
and (5) was given in [9] for the large system limit. Also, as previously mentioned, the conditions for rank
one solutions to (5) were completely characterized in [22] where it was also established that the detector
was free of an error floor under the assumption that $H^T H$ is full rank. However, the result in [22] does
not extend to a statement regarding the diversity. Specifically, it is possible to show (using the result
of [22]) that an alternative SDR receiver which calls an error whenever the solution to (5) is not of rank one would
not have the maximum diversity. In other words, the second phase of the SDR receiver where high rank
solutions are used to obtain symbol estimates is crucial to the SDR performance and must be taken into
account in the analysis.

The main contribution of this work is a rather strong statement regarding SDR performance when
applied to a fading channel, namely that under the model in (1) with an i.i.d. Gaussian channel for
which n > m the SDR detector will have a diversity equal to that of the optimal, ML, detector. Loosely
speaking, although suboptimal, the SDR detector will have an error probability which vanishes at the
same rate as the ML detector in the high SNR limit and the loss due to suboptimality will be a shift in
SNR and not a loss of diversity. We formally state this as follows.

Theorem 1: Assume that $H \in \mathbb{R}^{n \times m}$ in (1) consists of i.i.d. Gaussian entries of zero mean and fixed
(non-zero) variance. Assume further that n > m. Then

$$ \lim_{\rho \to \infty} \frac{\ln P(\hat{s}_{SDR} \neq s)}{\ln \rho} = \lim_{\rho \to \infty} \frac{\ln P(\hat{s}_{ML} \neq s)}{\ln \rho} = -\frac{n}{2}. $$

It is important to note that the SDR (and maximum) diversity is $\frac{n}{2}$ in this case and not n. This is
because we explicitly consider a real valued channel matrix in (1) as opposed to the complex channel case
more frequently studied in the literature. It is straightforward to show the maximum achievable diversity
in this case is $\frac{n}{2}$ by extending the proof of [26] to cover the real valued case. In the case of ZF and
MMSE the diversity is $\frac{n-m+1}{2}$ which can be seen by following the argument of Section 8.5.1 in [2]
with a real valued channel matrix.

Following [27] we will throughout this work make use of the symbol $\doteq$ to denote exponential equality,
defined according to

$$ f(\rho) \doteq \rho^{-d} \quad \Leftrightarrow \quad \lim_{\rho \to \infty} \frac{\ln f(\rho)}{\ln \rho} = -d. \tag{7} $$

Similar definitions will also apply to the symbols $\dot{\leq}$ and $\dot{\geq}$. For reference, we list the most important
properties of the exponential equality in Appendix I. Using (7) generally allows for a more compact (and
suggestive) notation and in this notation the statement of Theorem 1 becomes

$$ P(\hat{s}_{SDR} \neq s) \doteq P(\hat{s}_{ML} \neq s) \doteq \rho^{-n/2}. $$

Now, most of the remaining part of this work is devoted to the proof of Theorem 1. The formal proof
is divided into several lemmas presented in Section IV and Section V. However, before presenting the
proof in full, a short outline is given in Section III.
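
As a small numerical illustration of the exponential equality in (7), the ratio $\ln f(\rho) / \ln \rho$ can be evaluated at increasing values of $\rho$. The example function below is an assumption chosen for this sketch: it is the probability that a chi-square variable with two degrees of freedom is at most $\rho^{-1}$, i.e. $1 - e^{-1/(2\rho)}$, for which the limit in (7) is $-1$.

```python
import math

def exponent(f, rho):
    """Finite-rho estimate of the limit of ln f(rho) / ln rho in (7)."""
    return math.log(f(rho)) / math.log(rho)

def f(rho):
    # P(chi^2_2 <= 1/rho) = 1 - exp(-1/(2 rho)); expm1 keeps precision for large rho.
    return -math.expm1(-0.5 / rho)

# Estimates at rho = 1e2, 1e4, 1e8, 1e16 (illustrative grid, an assumption).
estimates = [exponent(f, 10.0 ** k) for k in (2, 4, 8, 16)]
```

The estimates approach $-1$ only slowly, because the constant factor in $f(\rho) \approx \frac{1}{2}\rho^{-1}$ contributes a term of order $1/\ln \rho$; this is the sense in which exponential equality ignores constants and reflects only the slope in log-log scale.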

III. The SDR Diversity Proof, Outline 

Note that due to the symmetry of the problem (and the detector) it can without loss of generality
be assumed that s = e was transmitted. This will also be done in the sequel. In the m = 2 case it is
possible to graphically illustrate the feasible set, $\mathcal{X}$, of (5) in order to gain intuition. To this end, consider
parameterizing $X \in \mathcal{X}$ as in [28] or [5], i.e. according to

$$ X = \begin{bmatrix} 1 & x & y \\ x & 1 & z \\ y & z & 1 \end{bmatrix}. $$

The feasible set, $\mathcal{X}$, is illustrated in Fig. 2. The rank one matrix, $X_e$, that corresponds to the transmitted
message, s = e, is also indicated in the figure.

Intuitively, one can characterize the error events of the SDR receiver as follows. When the optimal
point of (5), $X^*$, is close to $X_e$ then the rounding procedure described in Section II will be able to
recover the correct rank one matrix, namely $X_e$. It is only when the optimal point of (5) is far from $X_e$
that an error can occur.

Consider now the introduction of a hyperplane, $\mathcal{H}$, as in Fig. 2, that separates the points in $\mathcal{X}$ that are
close to and far from $X_e$. Specifically, let $\mathcal{X}_+$ be the points in $\mathcal{X}$ that are on the same side of $\mathcal{H}$ as $X_e$
and let $\mathcal{X}_-$ be the points on the other side. Assume also that $\mathcal{H}$ is chosen such that points in $\mathcal{X}_+$ are
rounded off to $X_e$. Let us also first consider the zero noise case, i.e. when v = 0. In this case $X_e$ is
always optimal for (5) with a criterion value equal to 0. Further, let $\tau > 0$ be given by

$$ \tau = \min_{X \in \mathcal{X} \cap \mathcal{H}} \mathrm{Tr}(LX), $$

i.e. $\tau$ is the minimum objective value over the intersection of the hyperplane and the feasible set, assuming
v = 0. As the criterion function, Tr(LX), is linear and $\mathcal{X}$ is convex it follows that the criterion function
for any $X \in \mathcal{X}_-$ will also satisfy $\mathrm{Tr}(LX) \geq \tau$.

Now allow for $v \neq 0$ but assume that $\|v\|$ is significantly smaller than $\tau$. In this case, $\mathrm{Tr}(LX_e)$ is still
small as $\mathrm{Tr}(LX_e)$ is continuous in v. At the same time it is guaranteed that Tr(LX) is not significantly
smaller than $\tau$ for any $X \in \mathcal{X}_-$, again since Tr(LX) is continuous in v. This implies that there is a
point in $\mathcal{X}_+$ with a criterion value close to zero, while all points in $\mathcal{X}_-$ have objective values which are
at least on the order of $\tau$. In other words, the optimum over $\mathcal{X}$ must belong to $\mathcal{X}_+$ and therefore be close
to $X_e$. This in turn implies that no error is made by the SDR receiver. In short, it is sufficient that $\tau$ is
large in comparison with the noise in order for the detector to make a correct decision. This statement
is also made rigorous by Lemma 1 in Section IV.

The proof of Theorem 1 follows the heuristic argument given above and is divided into two parts.
The first part is concerned with proving that the error probability of the SDR detector is, in the high
SNR regime, governed by the probability that $\tau$ is atypically small rather than the probability that v
is atypically large. This statement is formalized by Lemma 2 in Section IV. Note that the technique of
interpreting typical errors as caused by particularly bad channels (in our case channels which cause $\tau$ to
be small) is common in the literature, see e.g. [2]. It is also similar in many respects to the analysis of
coded multiple antenna systems where errors are typically caused by channels in outage [27].

The second part of the proof, contained in Section V, is concerned with bounding the probability that
$\tau$ is atypically small. Note that in order for $\tau$ to be small there must be at least one $X \in \mathcal{X} \cap \mathcal{H}$ for
which Tr(LX) is small. In essence, the technique used to establish our bound on the probability of $\tau$
being small can be summarized as follows.

1) Cover $\mathcal{X} \cap \mathcal{H}$ (or more precisely a set isomorphic to $\mathcal{X} \cap \mathcal{H}$) with $\epsilon$-balls and bound the probability
that each specific $\epsilon$-ball contains an X for which Tr(LX) is small.

2) Count the number of $\epsilon$-balls required to cover $\mathcal{X} \cap \mathcal{H}$ and use the union bound to bound the
probability that $\tau$ is small.

Much of the difficulty of the proof stems from the fact that the probability that each $\epsilon$-ball contains an X for
which Tr(LX) is small depends on where in $\mathcal{X} \cap \mathcal{H}$ the $\epsilon$-ball is located. Also, the technically most
challenging part of the proof relates to counting the number of $\epsilon$-balls required to cover certain subsets
of $\mathcal{X} \cap \mathcal{H}$. The analysis of each particular $\epsilon$-ball is provided by Lemma 5 and the counting argument is
captured in Lemma 4 in Section V. The proof of Theorem 1 given at the end of Section V then follows
by combining Lemma 3 and Lemma 4.

IV. The SDR Diversity Proof, Part I 

The purpose of this section is to give rigorous justification of the first part of the heuristic argument
given in Section III and show that the noise, v, can effectively be removed from (or integrated out of)
the analysis of the receiver diversity. To this end, we will begin by giving a proper definition of some of
the concepts appearing in the heuristic argument.

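First of all, the feasible set, $\mathcal{X}$, of (5) is given by

$$ \mathcal{X} \triangleq \{X \in \mathbb{S}^{m+1} \mid \mathrm{diag}(X) = e, \ X \succeq 0\} \tag{8} $$

where $\mathbb{S}^{m+1}$ denotes the set of symmetric matrices in $\mathbb{R}^{(m+1) \times (m+1)}$. Let $\mathcal{H}$ be the hyperplane (or affine subset of
$\mathbb{S}^{m+1}$) given by

$$ \mathcal{H} \triangleq \{X \in \mathbb{S}^{m+1} \mid \mathrm{Tr}(MXM^T) = 1\} \tag{9} $$

where

$$ M \triangleq \begin{bmatrix} I & -e \end{bmatrix} \in \mathbb{R}^{m \times (m+1)}. \tag{10} $$

It will later be established that an $\mathcal{H}$ chosen this way is sufficient for separating points close to $X_e$ from
points far from $X_e$. The optimal value of Tr(LX) over the intersection set $\mathcal{X} \cap \mathcal{H}$ is under the zero
noise, v = 0, assumption given by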

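$$ \tau \triangleq \min_{X \in \mathcal{X} \cap \mathcal{H}} \mathrm{Tr}(L_0 X) \tag{11} $$

where

$$ L_0 \triangleq M^T Q M = \begin{bmatrix} Q & -Qe \\ -e^T Q & e^T Q e \end{bmatrix} $$

and $Q \triangleq H^T H$. Note that $L_0$ is equal to L in (4) when v = 0. It is also straightforward to show that $\tau$
is equivalently given by

$$ \tau = \inf_{Y \in \mathcal{Y}} \mathrm{Tr}(QY) \tag{12} $$

where

$$ \mathcal{Y} \triangleq M(\mathcal{X} \cap \mathcal{H})M^T = \bar{\mathcal{Y}} \cap \{Y \mid \mathrm{Tr}(Y) = 1\} \tag{13} $$

and

$$ \bar{\mathcal{Y}} \triangleq M \mathcal{X} M^T. \tag{14} $$

The set $\bar{\mathcal{Y}}$ is the image of $\mathcal{X} \subset \mathbb{S}^{m+1}$ under the linear mapping $X \mapsto MXM^T$ into $\mathbb{S}^m$, under which the criterion
$\mathrm{Tr}(L_0 X)$ and $\mathcal{H}$ have a somewhat simpler structure. Note also that $\bar{\mathcal{Y}}$ is convex since it is a linear
transformation of a convex set. The main reason for introducing (12) is that it is frequently more
convenient to work with (12) rather than with (11) directly.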
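
The relation between $L_0$, Q and M can be verified numerically. The sketch below assumes $M = [I \ \ {-e}]$, which is the form implied by the identities used later in the proof of Lemma 1; the dimensions, seed and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)              # arbitrary seed (assumption)
n, m = 5, 3                                 # illustrative dimensions (assumption)
H = rng.standard_normal((n, m))
e = np.ones(m)
Q = H.T @ H

M = np.hstack([np.eye(m), -e[:, None]])     # assumed form M = [I  -e], m x (m+1)
L0 = M.T @ Q @ M                            # zero-noise criterion matrix

# L from (4) for the noise-free observation y = He of the message s = e.
y = H @ e
L = np.block([[Q, -(H.T @ y)[:, None]],
              [-(y @ H)[None, :], np.array([[y @ y]])]])

x_e = np.append(e, 1.0)
X_e = np.outer(x_e, x_e)                    # rank one point for s = e
Y_e = M @ X_e @ M.T                         # its image under X -> M X M^T
```

Under these assumptions `L0` coincides with `L`, and `Y_e` is the zero matrix, so $X_e$ attains the criterion value 0 and is mapped to the origin of the transformed set.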

We are now able to pose and prove the first lemma regarding the error probability of the SDR detector.
In essence, we wish to establish that a large $\tau$ is sufficient for correct detection. This statement is
captured by Lemma 1 given below (note again that s = e is assumed to be the transmitted message).

Lemma 1: Let $\tau$ be given by (11). Then

$$ \tau > 4\|v\|^2 \quad \Rightarrow \quad \hat{s}_{SDR} = e. $$

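Proof: We will first prove the lemma under the assumption that the optimal point of (5) is rank deficient
and then argue that this assumption can be made without loss of generality. Thus, consider an $X \in \mathcal{X}$
for which $X \nsucc 0$ (X is positive semidefinite but not positive definite) and partition X as

$$ X = \begin{bmatrix} A^T \\ a^T \end{bmatrix} \begin{bmatrix} A & a \end{bmatrix} = \begin{bmatrix} A^T A & A^T a \\ a^T A & a^T a \end{bmatrix} $$

where $A \in \mathbb{R}^{m \times m}$ and $a \in \mathbb{R}^m$. Note that this is possible since X has at most rank m. Note also that
$\|a\| = 1$ follows from diag(X) = e. Further, note that the matrix L defined in (4) can be written as

$$ L = \begin{bmatrix} H^T \\ -y^T \end{bmatrix} \begin{bmatrix} H & -y \end{bmatrix}. $$

Thus,

$$ \begin{aligned} \mathrm{Tr}(LX) &= \mathrm{Tr}\left( \begin{bmatrix} H^T \\ -y^T \end{bmatrix} \begin{bmatrix} H & -y \end{bmatrix} \begin{bmatrix} A^T \\ a^T \end{bmatrix} \begin{bmatrix} A & a \end{bmatrix} \right) \\ &= \mathrm{Tr}\left( \begin{bmatrix} H & -y \end{bmatrix} \begin{bmatrix} A^T \\ a^T \end{bmatrix} \begin{bmatrix} A & a \end{bmatrix} \begin{bmatrix} H^T \\ -y^T \end{bmatrix} \right) \\ &= \mathrm{Tr}\left( (HA^T - ya^T)(HA^T - ya^T)^T \right) \\ &= \|HA^T - ya^T\|^2 \end{aligned} $$

where $\|\cdot\|$ above refers to the Frobenius norm. Now, the model of (1) for s = e yields (through y = He + v)

$$ \mathrm{Tr}(LX) = \|H(A^T - ea^T) - va^T\|^2. $$

Note that

$$ \begin{aligned} \|H(A^T - ea^T) - va^T\| &\geq \|H(A^T - ea^T)\| - \|va^T\| \\ &= \|H(A^T - ea^T)\| - \|v\| \end{aligned} $$

where the last equality follows from $\|a\| = 1$. Thus, whenever

$$ \|H(A^T - ea^T)\| > 2\|v\| \quad \Leftrightarrow \quad \|H(A^T - ea^T)\|^2 > 4\|v\|^2 $$

it follows that

$$ \mathrm{Tr}(LX) > \|v\|^2. $$

At the same time, for

$$ X_e = \begin{bmatrix} e \\ 1 \end{bmatrix} \begin{bmatrix} e^T & 1 \end{bmatrix} $$

it follows that

$$ \begin{aligned} \mathrm{Tr}(LX_e) &= \mathrm{Tr}\left( \begin{bmatrix} H^T \\ -y^T \end{bmatrix} \begin{bmatrix} H & -y \end{bmatrix} \begin{bmatrix} e \\ 1 \end{bmatrix} \begin{bmatrix} e^T & 1 \end{bmatrix} \right) \\ &= \mathrm{Tr}\left( \begin{bmatrix} H & -y \end{bmatrix} \begin{bmatrix} e \\ 1 \end{bmatrix} \begin{bmatrix} e^T & 1 \end{bmatrix} \begin{bmatrix} H^T \\ -y^T \end{bmatrix} \right) \\ &= \mathrm{Tr}\left( (He - y)(He - y)^T \right) \\ &= \|He - y\|^2 = \|v\|^2. \end{aligned} $$

Thus, by the two preceding relations, it follows that

$$ \|H(A^T - ea^T)\|^2 > 4\|v\|^2 \quad \Rightarrow \quad \mathrm{Tr}(LX) > \mathrm{Tr}(LX_e) $$

which implies that X can not be optimal for (5) if

$$ \|H(A^T - ea^T)\|^2 > 4\|v\|^2 \quad \Leftrightarrow \quad \|H(A^T - ea^T)\| > 2\|v\|. $$

Now, note that

$$ (A^T - ea^T) = M \begin{bmatrix} A^T \\ a^T \end{bmatrix} $$

for M defined in (10) and

$$ \begin{aligned} \|H(A^T - ea^T)\|^2 &= \mathrm{Tr}\left( HM \begin{bmatrix} A^T \\ a^T \end{bmatrix} \begin{bmatrix} A & a \end{bmatrix} M^T H^T \right) \\ &= \mathrm{Tr}\left( H^T H M \begin{bmatrix} A^T \\ a^T \end{bmatrix} \begin{bmatrix} A & a \end{bmatrix} M^T \right) \\ &= \mathrm{Tr}(H^T H M X M^T). \end{aligned} \tag{18} $$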
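
The chain of identities used so far in the proof can be checked numerically on a randomly generated rank deficient feasible point. The sketch below is a sanity check only; the dimensions, seed, noise scale and variable names are assumptions, and M is taken to be $[I \ \ {-e}]$.

```python
import numpy as np

rng = np.random.default_rng(3)          # arbitrary seed (assumption)
n, m = 5, 3                             # illustrative dimensions (assumption)
H = rng.standard_normal((n, m))
e = np.ones(m)
v = 0.1 * rng.standard_normal(n)        # noise realisation (arbitrary scale)
y = H @ e + v                           # model (1) with s = e

# Random factor F = [A a] with unit-norm columns, so that X = F^T F has
# diag(X) = e, is positive semidefinite and has rank at most m.
F = rng.standard_normal((m, m + 1))
F /= np.linalg.norm(F, axis=0)
A, a = F[:, :m], F[:, m]
X = F.T @ F

Hy = np.hstack([H, -y[:, None]])        # [H  -y]
L = Hy.T @ Hy                           # L written as in the proof
M = np.hstack([np.eye(m), -e[:, None]]) # assumed form M = [I  -e]

D = A.T - np.outer(e, a)                # A^T - e a^T
lhs = np.trace(L @ X)
rhs = np.linalg.norm(H @ D - np.outer(v, a)) ** 2   # Frobenius norm squared
quad = np.linalg.norm(H @ D) ** 2
quad_trace = np.trace(H.T @ H @ M @ X @ M.T)        # right-hand side of (18)
```

The values `lhs` and `rhs` agree, as do `quad` and `quad_trace`, and the reverse triangle inequality $\sqrt{\mathrm{Tr}(LX)} \geq \|H(A^T - ea^T)\| - \|v\|$ holds for the generated point.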
Let $X^* \in \mathcal{X}$ be the optimal point for (5) and let $Y^* \in \bar{\mathcal{Y}}$ be given by $Y^* = MX^*M^T$. Note that

$$ \mathrm{Tr}(QY^*) \leq 4\|v\|^2 $$

for $Q = H^T H$ as otherwise $X^*$ would not be optimal due to the optimality condition derived above together with (18).

Assume (as in the lemma) that

$$ \tau > 4\|v\|^2. $$

This implies that $\mathrm{Tr}(QY) > 4\|v\|^2$ for any $Y \in \mathcal{Y}$. The same conclusion could also be drawn for
any $Y \in \bar{\mathcal{Y}}$ which satisfies $\mathrm{Tr}(Y) \geq 1$. This follows since $\bar{\mathcal{Y}}$ is a convex set which contains 0 (since
$0 = MX_eM^T$). That is, if there were $Y \in \bar{\mathcal{Y}}$ for which $\mathrm{Tr}(Y) \geq 1$ and $\mathrm{Tr}(QY) \leq 4\|v\|^2$ then
$\tilde{Y} = \gamma Y \in \mathcal{Y}$ for some $\gamma \in (0, 1]$ and $\mathrm{Tr}(Q\tilde{Y}) \leq 4\|v\|^2$, contrary to the assumption.
Thus, under the assumption of the lemma, it follows that

$$ \mathrm{Tr}(Y^*) < 1 $$

and $\|\mathrm{diag}(Y^*)\|_\infty < 1$ as $Y^* \succeq 0$ implies that $Y^*$ has nonnegative diagonal elements. Now, partition $X^*$ as

$$ X^* = \begin{bmatrix} B & b \\ b^T & 1 \end{bmatrix} $$

where diag(B) = e due to $\mathrm{diag}(X^*) = e$. Computing $Y^*$ explicitly under this partitioning yields

$$ Y^* = MX^*M^T = B - eb^T - be^T + ee^T $$

which implies

$$ \|e - b\|_\infty = \tfrac{1}{2}\|\mathrm{diag}(Y^*)\|_\infty < \tfrac{1}{2} $$

since $\mathrm{diag}(Y^*) = 2e - 2b$. Thus, the rounding procedure given in (6) will round the last column of $X^*$,
namely b, to e and it follows that $\hat{s}_{SDR} = e$.

What remains now is to show that the optimal point of (5) must be rank deficient. By applying the
result in [29] it is known that there will always be a rank deficient optimal point. A potential problem
could arise if there are several optimal points, some of which are full rank. We will however show
that this is not possible.

In order for any optimal point of (5) to be full rank, all off diagonal elements of L in (4) must
be identically zero. This follows since otherwise there would be a search direction in the nullspace of
diag(X) = e for which the criterion function would decrease, contradicting the optimality of any full
rank X. Thus $H^T H$ has zero off diagonal elements (as it appears in L) and H has orthogonal columns.
In this special case the SDR will always have rank one solutions which are unique as long as the ML
problem has a unique solution [22]. However, the assumption that $\tau > 4\|v\|^2$ implies that

$$ \|y - He\|^2 < \|y - Hs\|^2 $$

for any $s \in \mathbb{B}^m$, $s \neq e$, and it follows that the ML solution is unique. Therefore, there are no full rank
solutions under the assumption in the lemma. This completes the proof. ■
Essentially, Lemma 1 states that for an error to occur in the high SNR regime one of two things must
happen. Either $\tau$ is atypically small or v is atypically large. As stated in Section III it can be argued that
the probability of the former event outweighs the probability of the latter. This is formally stated by the
following lemma which concludes this section.

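Lemma 2: Let $\tau$ be given by (11). Then

$$ P(\tau \leq \rho^{-1}) \,\dot{\leq}\, \rho^{-d} \quad \Rightarrow \quad P(\hat{s}_{SDR} \neq e) \,\dot{\leq}\, \rho^{-d}. \tag{19} $$

Proof: Assume (as was done in the lemma) that

$$ P(\tau \leq \rho^{-1}) \,\dot{\leq}\, \rho^{-d}. $$

This, combined with $P(\tau \leq \rho^{-1}) \leq 1$, implies that for any arbitrarily small $\delta > 0$ there is a constant, c,
for which

$$ P(\tau \leq \rho^{-1}) \leq c\rho^{-d+\delta} $$

for all $\rho > 0$. Now, by Lemma 1,

$$ P_e \triangleq P(\hat{s}_{SDR} \neq e) \leq P(\tau \leq 4\|v\|^2). $$

Introduce a Gaussian vector, $w \in \mathbb{R}^n$, with i.i.d. zero mean elements of variance one and note that
$\rho^{-1}\|w\|^2$ has the same distribution as $\|v\|^2$. Let $f_{\|w\|^2}(\gamma)$ denote the probability density function of
$\gamma = \|w\|^2$. As $\tau$ is independent of v (and w) it follows that

$$ \begin{aligned} P_e &\leq P(\tau \leq 4\rho^{-1}\|w\|^2) \\ &= \int_0^\infty P(\tau \leq 4\rho^{-1}\|w\|^2 \mid \|w\|^2 = \gamma) \, f_{\|w\|^2}(\gamma) \, d\gamma \\ &= \int_0^\infty P(\tau \leq 4\rho^{-1}\gamma) \, f_{\|w\|^2}(\gamma) \, d\gamma \\ &\leq c \, 4^{d-\delta} \rho^{-d+\delta} \int_0^\infty \gamma^{d-\delta} f_{\|w\|^2}(\gamma) \, d\gamma \\ &= c \, 4^{d-\delta} \rho^{-d+\delta} \, E\{\|w\|^{2(d-\delta)}\} = c' \rho^{-d+\delta} \end{aligned} $$

for some c' independent of $\rho$. Note that $c' < \infty$ follows since $\|w\|$ has finite moments. Thus,

$$ P_e \,\dot{\leq}\, \rho^{-d+\delta}. $$

However, as the relation holds for arbitrarily small $\delta > 0$ it follows that

$$ P_e \,\dot{\leq}\, \rho^{-d} $$

which concludes the proof.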



V. The SDR Diversity Proof, Part II 

Let $\tau$ be given by (11) or equivalently (12). In light of Lemma 2, all that remains to be done in order
to prove Theorem 1 is to provide a bound on

$$ P(\tau \leq \rho^{-1}) $$

in the high SNR limit. Note however that at this point the variable $\rho^{-1}$ is just a dummy variable and we
can, and will, replace $\rho^{-1}$ by $\epsilon$ and study the probability that $\tau \leq \epsilon$ for small $\epsilon > 0$. Thus, what remains
to be done is to bound $P(\tau \leq \epsilon)$ around $\epsilon = 0$. We will also in the remaining part of this work focus
on the optimization problem given in (12) rather than the equivalent problem in (11).

The probability that $\mathrm{Tr}(QY) \leq \epsilon$ for some particular $Y \in \mathcal{Y}$ will generally depend on the specific Y
considered (as mentioned in Section III). In order to deal with this we shall first partition $\mathcal{Y}$ into a finite
number of subsets $\{\mathcal{Y}_i\}$,

$$ \mathcal{Y} = \bigcup_i \mathcal{Y}_i, $$

such that $P(\mathrm{Tr}(QY) \leq \epsilon)$ is more or less constant for all Y within one such subset. Then, the probability
that $\tau \leq \epsilon$ will be bounded by applying the union bound according to

$$ P(\tau \leq \epsilon) \leq \sum_i P(\tau_i \leq \epsilon) \tag{20} $$

where

$$ \tau_i \triangleq \inf_{Y \in \mathcal{Y}_i} \mathrm{Tr}(QY) $$

and where by the properties of the exponential equality listed in Appendix I it is known that the sum in (20) will in the exponential
equality sense be given (or completely dominated) by its maximal term.

It is interesting to note that this corresponds to the identification of typical error events (or classes of 
error events), which is closely related to the analysis of typical outage events in [27]. However, in [27] 
typical events where identified by classifying particularly bad channels, H, while here, we shall use the 
concept to identify particularly troublesome subsets of y. In essence, we shall partition y based on the 
eigenvalues of Y G 3^ (or how close to singular Y is). Then the subset which dominates (EUl will be 
found by optimizing over the possible eigenvalue combinations. Note also that these subsets will generally 
depend on e but that we will adopt a somewhat casual terminology and refer to them simply as subsets 
rather than by the technically more correct term "sequence of subsets". However, before considering the 
general partitioning of y into such subsets we will treat two motivating, and relatively simple, special 
cases to gain intuition. 

A. Special cases 

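1) Rank one matrices: First, let us consider the set of rank one matrices $\mathbf{Y} \in \mathcal{Y}$, i.e. the set given by

\[ \mathcal{Y}_{R1} \triangleq \mathcal{Y} \cap \{\mathbf{Y} \mid \mathrm{Rank}(\mathbf{Y}) = 1\}. \]

For any particular $\mathbf{Y}$ in this set, with an eigenvalue decomposition given by $\mathbf{Y} = \sigma\mathbf{u}\mathbf{u}^T$ where $\|\mathbf{u}\| = 1$,
we have

\[ \mathrm{Tr}(\mathbf{Q}\mathbf{Y}) = \sigma\mathbf{u}^T\mathbf{Q}\mathbf{u}. \tag{21} \]

As $\sigma = 1$ due to the constraint $\mathrm{Tr}(\mathbf{Y}) = 1$ it follows that

\[ P(\mathrm{Tr}(\mathbf{Q}\mathbf{Y}) \leq \epsilon) = P(\|\mathbf{H}\mathbf{u}\|^2 \leq \epsilon) \doteq \epsilon^{n/2} \]

for this particular $\mathbf{Y} \in \mathcal{Y}_{R1}$. It can also be shown that there are exactly $2^m - 1$ distinct $\mathbf{Y} \in \mathcal{Y}_{R1}$. In
essence, each such $\mathbf{Y}$ corresponds to the point at which the line (in $\mathcal{X}$) connecting

\[ \begin{bmatrix} \mathbf{s} \\ 1 \end{bmatrix} \begin{bmatrix} \mathbf{s}^T & 1 \end{bmatrix} \]

and $\mathbf{X}_{\mathbf{e}}$ intersects the hyperplane $\mathcal{H}$ defined earlier. Therefore, by applying the union bound to the finite
number of rank one $\mathbf{Y} \in \mathcal{Y}_{R1}$ it follows that

\[ P(\tau_{R1} \leq \epsilon) \doteq \epsilon^{n/2} \]

where

\[ \tau_{R1} \triangleq \inf_{\mathbf{Y} \in \mathcal{Y}_{R1}} \mathrm{Tr}(\mathbf{Q}\mathbf{Y}). \]

Note also that there is a one-to-one correspondence between the rank one matrices and all possible
messages (not equal to the transmitted message), $\mathbf{s} \in \mathbb{B}^m \setminus \mathbf{e}$, that are searched over by the ML detector.
This is also the reason why

\[ P(\tau_{R1} \leq \epsilon) \doteq P(\hat{\mathbf{s}}_{\mathrm{ML}} \neq \mathbf{e}). \]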
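
The $\epsilon^{n/2}$ behaviour used above can be checked numerically. The sketch below is illustrative only: it assumes unit-variance entries, so that $\|\mathbf{H}\mathbf{u}\|^2$ is chi-square distributed with $n$ degrees of freedom, and evaluates the distribution function through the power series of the regularized lower incomplete gamma function. The function names are hypothetical.

```python
import math

def chi2_cdf(x: float, n: int, terms: int = 200) -> float:
    """P(||h||^2 <= x) for h ~ N(0, I_n), via the series
    P(a, t) = t^a e^{-t} sum_{k>=0} t^k / Gamma(a + k + 1) with a = n/2, t = x/2."""
    if x <= 0.0:
        return 0.0
    a, t = n / 2.0, x / 2.0
    total = 0.0
    for k in range(terms):
        total += math.exp(k * math.log(t) - math.lgamma(a + k + 1.0))
    return math.exp(a * math.log(t) - t) * total

def empirical_exponent(n: int, eps: float) -> float:
    """log P(||h||^2 <= eps) / log eps, which approaches n/2 as eps -> 0."""
    return math.log(chi2_cdf(eps, n)) / math.log(eps)
```

For small `eps` the returned ratio approaches `n / 2` slowly, since the constant factor in front of $\epsilon^{n/2}$ only vanishes on the logarithmic scale.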

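2) Full rank matrices: Next, consider the set of full rank (or more precisely well conditioned) $\mathbf{Y} \in \mathcal{Y}$
given by

\[ \mathcal{Y}_{FR} \triangleq \mathcal{Y} \cap \{\mathbf{Y} \mid \mathbf{Y} \succeq c\mathbf{I}\} \]

for some constant $c > 0$, and let

\[ \tau_{FR} \triangleq \inf_{\mathbf{Y} \in \mathcal{Y}_{FR}} \mathrm{Tr}(\mathbf{Q}\mathbf{Y}). \]

As the criterion function, $\mathrm{Tr}(\mathbf{Q}\mathbf{Y})$, may be bounded as

\[ \mathrm{Tr}(\mathbf{Q}\mathbf{Y}) \geq c\,\mathrm{Tr}(\mathbf{Q}) = c\|\mathbf{H}\|^2 \]

for any $\mathbf{Y} \in \mathcal{Y}_{FR}$ it follows directly that

\[ P(\tau_{FR} \leq \epsilon) \,\dot{\leq}\, \epsilon^{mn/2} \]

by applying property (37d) in Appendix I. This result can also be strengthened to show that

\[ P(\tau_{FR} \leq \epsilon) \doteq \epsilon^{mn/2}. \]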

3) Discussion: The implication of the results in Sections IV-A.1 and IV-A.2 is that the event that $\tau \leq \epsilon$
is (in the limit) much less likely to be caused by one of the matrices in $\mathcal{Y}_{FR}$ than by one of the matrices
in $\mathcal{Y}_{R1}$. The probability of the former is on the order of $\epsilon^{mn/2}$ while the latter is only $\epsilon^{n/2}$, and
$\epsilon^{mn/2} \ll \epsilon^{n/2}$ when $\epsilon$ is small (provided $m > 1$). Thus, (in a very loose sense) the reason for the high
diversity of the SDR detector is that the elements added in the relaxation (the ones in $\mathcal{Y}_{FR}$) are less likely
to cause errors than the elements already present in the feasible set of the ML detection problem (the ones
in $\mathcal{Y}_{R1}$).

The question which however remains to be answered is if there is some other set of $\mathbf{Y}$, somewhere
between the full rank and rank one matrices, which can cause $\tau \leq \epsilon$ to occur with a probability
substantially larger than $\epsilon^{n/2}$. The answer to this question is, somewhat surprisingly, no provided that
$n \geq m$ (but yes in some $n < m$ cases). In fact, most of the remaining part of the paper is concerned
with the formal proof of this statement.

B. The General Case

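In the general case we consider sets of the form given by

\[ \mathcal{Y}(\mathbf{a},\mathbf{b}) \triangleq \mathcal{Y} \cap \{\mathbf{Y} \mid \epsilon^{a_k} \leq \sigma_k(\mathbf{Y}) \leq \epsilon^{b_k}\} \tag{22} \]

where $\mathbf{a} = (a_1, \ldots, a_m)$, $\mathbf{b} = (b_1, \ldots, b_m)$ and $\sigma_k(\mathbf{Y})$ denotes the $k$th eigenvalue of $\mathbf{Y}$. For notational
convenience we will also in (22) interpret $\epsilon^{a_k}$ as $0$ for $a_k = \infty$ in order to allow one or more eigenvalues
to be identically equal to zero. We can without loss of generality assume that the eigenvalues are ordered
and that $0 \leq a_1 \leq \ldots \leq a_m$, $0 = b_1 \leq \ldots \leq b_m$ and $b_k \leq a_k$ for $k = 1, \ldots, m$. Note that the assumption
that $b_1 = 0$ can be made since (22) would, due to the $\mathrm{Tr}(\mathbf{Y}) = 1$ constraint of $\mathcal{Y}$ in (13), be empty
otherwise. Similarly to before we define

\[ \tau(\mathbf{a},\mathbf{b}) \triangleq \inf_{\mathbf{Y} \in \mathcal{Y}(\mathbf{a},\mathbf{b})} \mathrm{Tr}(\mathbf{Q}\mathbf{Y}). \tag{23} \]

In what follows, a bound on the probability of $\tau(\mathbf{a},\mathbf{b}) \leq \epsilon$ is obtained by first partitioning $\mathcal{Y}(\mathbf{a},\mathbf{b})$
into even smaller sets (essentially $\epsilon$-balls) and then using the union bound to bound $P(\tau(\mathbf{a},\mathbf{b}) \leq \epsilon)$. It
will be more convenient to work with a square root factorization of $\mathbf{Y} \in \mathcal{Y}$ instead of with $\mathbf{Y}$ directly.
Thus, we define a function,

\[ \varphi : \mathbb{S}^m_+ \rightarrow \mathbb{R}^{m \times m} \tag{24} \]

(where $\mathbb{S}^m_+$ denotes the set of symmetric, positive semidefinite matrices) for which $\mathbf{A} = \varphi(\mathbf{Y})$ satisfies
$\mathbf{A} = \mathbf{U}\Sigma^{1/2}$ and where $\mathbf{U}\Sigma\mathbf{U}^T = \mathbf{Y}$ is the eigenvalue decomposition of $\mathbf{Y}$. That is, $\varphi$ provides square
root factors of $\mathbf{Y}$ which have orthogonal columns with norms equal to the square roots of the eigenvalues
of $\mathbf{Y}$. Let $\mathcal{A}(\mathbf{a},\mathbf{b})$ be given by

\[ \mathcal{A}(\mathbf{a},\mathbf{b}) \triangleq \varphi(\mathcal{Y}(\mathbf{a},\mathbf{b})), \tag{25} \]

i.e. $\mathcal{A}(\mathbf{a},\mathbf{b})$ is the set of square root factors which can be obtained from $\mathbf{Y} \in \mathcal{Y}(\mathbf{a},\mathbf{b})$. Note that
$\mathrm{Tr}(\mathbf{Q}\mathbf{Y}) = \|\mathbf{H}\mathbf{A}\|^2$ since $\mathbf{Q} = \mathbf{H}^T\mathbf{H}$ and $\mathbf{A} = \varphi(\mathbf{Y})$. The random variable $\tau(\mathbf{a},\mathbf{b})$, defined in (23), can
thus be equivalently defined by

\[ \tau(\mathbf{a},\mathbf{b}) = \inf_{\mathbf{A} \in \mathcal{A}(\mathbf{a},\mathbf{b})} \|\mathbf{H}\mathbf{A}\|^2. \tag{26} \]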
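
The identity $\mathrm{Tr}(\mathbf{Q}\mathbf{Y}) = \|\mathbf{H}\mathbf{A}\|^2$ behind (26) is easy to confirm numerically. The following is a minimal sketch with hand-picked values: the rotation angle, eigenvalues and channel entries are arbitrary illustrations, the norm is taken to be the Frobenius norm, and the additional constraints defining $\mathcal{Y}$ are not checked here.

```python
import math

def matmul(a, b):
    """Plain matrix product for matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def frob_sq(a):
    """Squared Frobenius norm."""
    return sum(x * x for row in a for x in row)

def sqrt_factor(u, sigma):
    """A = U diag(sigma)^{1/2}: column k of U scaled by sqrt(sigma[k])."""
    return [[u[i][k] * math.sqrt(sigma[k]) for k in range(len(sigma))]
            for i in range(len(u))]

# Illustrative 2x2 example: U is a rotation and the eigenvalues sum to one.
theta = 0.3
U = [[math.cos(theta), -math.sin(theta)], [math.sin(theta), math.cos(theta)]]
sigma = [0.9, 0.1]
A = sqrt_factor(U, sigma)
Y = matmul(A, transpose(A))                  # Y = U Sigma U^T
H = [[0.5, -1.2], [2.0, 0.7], [-0.3, 1.1]]   # arbitrary 3x2 channel (n = 3, m = 2)
Q = matmul(transpose(H), H)                  # Q = H^T H
lhs = trace(matmul(Q, Y))                    # Tr(QY)
rhs = frob_sq(matmul(H, A))                  # ||HA||^2
```

The two quantities `lhs` and `rhs` agree up to rounding, which is the reason the proof can work with the factors $\mathbf{A}$ instead of $\mathbf{Y}$.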

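We are now ready to provide the first lemma regarding the probability that $\|\mathbf{H}\mathbf{A}\|^2 \leq \epsilon$ for any $\mathbf{A}$ in
an $\epsilon^{1/2}$-ball around a given center point $\bar{\mathbf{A}} \in \mathcal{A}(\mathbf{a},\mathbf{b})$.

Lemma 3: Consider $\bar{\mathbf{A}} \in \mathcal{A}(\mathbf{a},\mathbf{b})$ and define

\[ \mathcal{A}_\epsilon(\bar{\mathbf{A}}) \triangleq \{\mathbf{A} \mid \|\mathbf{A} - \bar{\mathbf{A}}\| \leq \epsilon^{1/2}\}. \tag{27} \]

Further, let

\[ \tau(\bar{\mathbf{A}}) \triangleq \inf_{\mathbf{A} \in \mathcal{A}_\epsilon(\bar{\mathbf{A}})} \|\mathbf{H}\mathbf{A}\|^2. \tag{28} \]

Then,

\[ P(\tau(\bar{\mathbf{A}}) \leq \epsilon) \,\dot{\leq}\, \epsilon^{\nu} \quad\text{where}\quad \nu \triangleq \sum_{k=1}^{m} \frac{n(1-a_k)^+}{2} \]

and where $(\cdot)^+ = \max(0, \cdot)$.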

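Proof: Note that, due to the rotational symmetry of the distribution of $\mathbf{H}$, it can without loss of generality
be assumed that $\bar{\mathbf{A}}$ is diagonal (and equal to $\Sigma^{1/2}$ where $\Sigma$ is a diagonal matrix containing the eigenvalues
of the $\mathbf{Y} \in \mathcal{Y}$ for which $\bar{\mathbf{A}} = \varphi(\mathbf{Y})$).

Pick some $\delta > 0$ and consider the event that

\[ \|\mathbf{H}\| \leq \epsilon^{-\delta} \tag{29} \]

and where at least one column of $\mathbf{H}$, $\mathbf{h}_k$, satisfies

\[ \|\mathbf{h}_k\| \geq 2\epsilon^{\frac{1-a_k}{2}-\delta}. \tag{30} \]

We will first show that this event implies that $\tau(\bar{\mathbf{A}}) > \epsilon$ and next that the event fails to occur with a
probability which is no larger (in the $\dot{\leq}$ sense) than $\epsilon^{\nu - nm\delta}$. Hence

\[ P(\tau(\bar{\mathbf{A}}) \leq \epsilon) \leq P\left(\|\mathbf{H}\| > \epsilon^{-\delta} \,\cup\, \|\mathbf{h}_k\| < 2\epsilon^{\frac{1-a_k}{2}-\delta}\ \forall k\right). \]

Note first that (30) implies

\[ \|\mathbf{h}_k\|\sigma_k^{1/2} \geq 2\epsilon^{\frac{1}{2}-\delta} \]

for at least one $k$ since $\sigma_k \geq \epsilon^{a_k}$. Note also that this implies

\[ \|\mathbf{H}\bar{\mathbf{A}}\| = \|\mathbf{H}\Sigma^{1/2}\| \geq 2\epsilon^{\frac{1}{2}-\delta}. \]

Now, consider $\|\mathbf{H}\mathbf{A}\|$ for any $\mathbf{A}$ satisfying $\|\mathbf{A} - \bar{\mathbf{A}}\| \leq \epsilon^{1/2}$. Under the additional assumption of (29) it
follows that

\begin{align*}
\|\mathbf{H}\mathbf{A}\| &= \|\mathbf{H}\bar{\mathbf{A}} - \mathbf{H}(\bar{\mathbf{A}} - \mathbf{A})\| \\
 &\geq \|\mathbf{H}\bar{\mathbf{A}}\| - \|\mathbf{H}(\bar{\mathbf{A}} - \mathbf{A})\| \\
 &\geq \|\mathbf{H}\bar{\mathbf{A}}\| - \|\mathbf{H}\|\,\|\bar{\mathbf{A}} - \mathbf{A}\| \\
 &\geq 2\epsilon^{\frac{1}{2}-\delta} - \epsilon^{\frac{1}{2}-\delta} = \epsilon^{\frac{1}{2}-\delta} > \epsilon^{\frac{1}{2}}
\end{align*}

where the last inequality holds whenever $\epsilon < 1$. Note also that $\|\mathbf{H}\mathbf{A}\| > \epsilon^{1/2}$ implies $\|\mathbf{H}\mathbf{A}\|^2 > \epsilon$.
Therefore, (29) and (30) imply that $\tau(\bar{\mathbf{A}}) > \epsilon$.

Now, consider the probability that (30) fails to hold, i.e. that

\[ \|\mathbf{h}_k\| < 2\epsilon^{\frac{1-a_k}{2}-\delta} \]

for all $k = 1, \ldots, m$. As the columns of $\mathbf{H}$ are independent this probability can be upper bounded as

\begin{align*}
P\left(\|\mathbf{h}_k\| < 2\epsilon^{\frac{1-a_k}{2}-\delta}\ \forall k\right)
 &= \prod_{k=1}^{m} P\left(\|\mathbf{h}_k\| < 2\epsilon^{\frac{1-a_k}{2}-\delta}\right) \\
 &\doteq \prod_{k=1}^{m} \epsilon^{\frac{n(1-a_k-2\delta)^+}{2}} \,\dot{\leq}\, \epsilon^{\nu - nm\delta}
\end{align*}

where we have used

\[ P(\|\mathbf{h}\| \leq \epsilon^{c/2}) = P(\|\mathbf{h}\|^2 \leq \epsilon^{c}) \doteq \epsilon^{\frac{n c^+}{2}} \]

according to (37d) in Appendix I with $\epsilon = \rho^{-1}$. The probability that (29) fails to hold can be upper
bounded as

\[ P(\|\mathbf{H}\| > \epsilon^{-\delta}) \doteq \epsilon^{\infty} \]

according to (37e) in Appendix I. Therefore, by applying the union bound,

\[ P(\tau(\bar{\mathbf{A}}) \leq \epsilon) \leq P\left(\|\mathbf{H}\| > \epsilon^{-\delta} \,\cup\, \|\mathbf{h}_k\| < 2\epsilon^{\frac{1-a_k}{2}-\delta}\ \forall k\right) \,\dot{\leq}\, \epsilon^{\nu - nm\delta}. \]

However, as $\delta > 0$ was arbitrary it follows that

\[ P(\tau(\bar{\mathbf{A}}) \leq \epsilon) \,\dot{\leq}\, \epsilon^{\nu} \]

which concludes the proof. ■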
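The next lemma provides a bound on the number of $\epsilon^{1/2}$-balls (defined as in (27)) which are required
to completely cover the set $\mathcal{A}(\mathbf{a},\mathbf{b})$. Lemma 4 is the technically most difficult result of this work and
we discuss this lemma below but save the stringent proof for Appendix II.

Lemma 4: Let $\mathcal{A}(\mathbf{a},\mathbf{b})$ and $\mathcal{A}_\epsilon(\bar{\mathbf{A}})$ be defined as in (25) and (27), respectively. Then there is a collection
of points, $\mathfrak{A} = \{\bar{\mathbf{A}}_i\}$, for which

\[ \mathcal{A}(\mathbf{a},\mathbf{b}) \subset \bigcup_{\bar{\mathbf{A}}_i \in \mathfrak{A}} \mathcal{A}_\epsilon(\bar{\mathbf{A}}_i) \]

and

\[ |\mathfrak{A}| \,\dot{\leq}\, \epsilon^{-\mu} \]

where $|\mathfrak{A}|$ denotes the number of elements of $\mathfrak{A}$ and where

\[ \mu \triangleq \sum_{k=2}^{m} \frac{(m-k+2)(1-b_k)^+}{2}. \tag{31} \]

Proof: Given in Appendix II. ■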

Essentially, the proof of Lemma 4 relies on a geometric argument based on the dimensionality of low
rank subsets of $\mathcal{A}$. Specifically, as part of the proof of Lemma 4 it is shown that the set of rank $r$ matrices
$\mathbf{A} \in \mathcal{A}$, i.e.

\[ \mathcal{A}_{Rr} \triangleq \mathcal{A} \cap \{\mathbf{A} \mid \mathrm{Rank}(\mathbf{A}) = r\}, \]

is part of a $d_r$-dimensional (smooth) manifold where

\[ d_r = \sum_{k=2}^{r} (m - k + 2), \quad r = 2, \ldots, m \]

and $d_1 = 0$. The manifold containing $\mathcal{A}_{Rr}$ is locally diffeomorphic (having a one-to-one differentiable
relation) with the $d_r$-dimensional unit cube in $\mathbb{R}^{d_r}$ (this is a property of any smooth $d_r$-dimensional
manifold [30] and not specific to $\mathcal{A}_{Rr}$). The volume, $V$, covered by one $d_r$-dimensional $\epsilon^{1/2}$-ball is on the
order of

\[ V \doteq (\epsilon^{1/2})^{d_r} = \epsilon^{d_r/2} \]

and therefore one needs on the order of

\[ N \doteq \frac{1}{V} = \epsilon^{-d_r/2} \tag{32} \]

such $\epsilon^{1/2}$-balls to cover the unit cube in $\mathbb{R}^{d_r}$. By exploiting that there is a differentiable (and therefore
continuous) map between the unit cube and the manifold this result carries over to a covering of $\mathcal{A}_{Rr}$.
Thus, the set of rank $r$ matrices, $\mathcal{A}_{Rr}$, can be covered by a collection of points, $\mathfrak{A}_r$, satisfying

\[ |\mathfrak{A}_r| \,\dot{\leq}\, \epsilon^{-\mu_r} \]

where

\[ \mu_r = \frac{d_r}{2} = \sum_{k=2}^{r} \frac{(m - k + 2)}{2}. \]

Extending this line of reasoning from the rank $r$ subsets, $\mathcal{A}_{Rr}$, to subsets which are close to
being low rank in the sense that the singular values of $\mathbf{A}$ are bounded by powers of $\epsilon$ yields the result
stated in Lemma 4. Note also that this is similar to the discussion following Theorem 4 in [27].
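
The counting argument can be illustrated with a few lines of Python. This is a rough sketch only: the function names are hypothetical, and the unit cube is covered with axis-aligned grid cells of side $\epsilon^{1/2}$ rather than balls, which changes the count by a constant factor but not the exponent.

```python
import math

def manifold_dim(m: int, r: int) -> int:
    """d_r = sum_{k=2}^{r} (m - k + 2), with d_1 = 0 (empty sum)."""
    return sum(m - k + 2 for k in range(2, r + 1))

def covering_exponent(m: int, r: int) -> float:
    """mu_r = d_r / 2, so that on the order of eps**(-mu_r) balls are needed."""
    return manifold_dim(m, r) / 2.0

def cube_cover_count(d: int, eps: float) -> int:
    """Number of grid cells of side eps**0.5 needed to tile the unit cube in R^d."""
    per_axis = math.ceil(1.0 / math.sqrt(eps))
    return per_axis ** d
```

For example, `cube_cover_count(d, eps)` grows like `eps ** (-d / 2)` as `eps` shrinks, matching (32) with `d = manifold_dim(m, r)`.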
Now, Lemma 3 and Lemma 4 can be combined in order to bound the probability that $\mathcal{A}(\mathbf{a},\mathbf{b})$ contains
an $\mathbf{A}$ for which $\|\mathbf{H}\mathbf{A}\|^2 \leq \epsilon$. Then, by optimizing over $\mathbf{a}$ and $\mathbf{b}$, one can find the set of the form of
$\mathcal{A}(\mathbf{a},\mathbf{b})$ most likely to contain such an $\mathbf{A}$. It can also be argued that this set will dominate the probability
of error in the high SNR regime. These ideas are captured by the following lemma.

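Lemma 5: Let $\tau$ be defined as in (11). Then

\[ P(\tau \leq \epsilon) \,\dot{\leq}\, \epsilon^{\zeta} \]

where

\[ \zeta \triangleq \min_{c_2, \ldots, c_m \in [0,1]} \; \frac{n}{2} + \sum_{k=2}^{m} \frac{(n - m + k - 2)\,c_k}{2}. \tag{33} \]

Proof: Consider picking some $\mathbf{b} = (b_1, \ldots, b_m)$ for which $b_1 = 0$ and $b_1 \leq b_2 \leq \cdots \leq b_m \leq 1$ and
choose a $\delta > 0$. Let $\mathbf{a} = (a_1, \ldots, a_m)$ be given such that $a_1 = \delta$ and $a_k = b_k + \delta$ if $b_k + \delta \leq 1$ or
$a_k = \infty$ otherwise for $k = 2, \ldots, m$.

The probability that $\tau(\mathbf{a},\mathbf{b}) \leq \epsilon$, where $\tau(\mathbf{a},\mathbf{b})$ is defined in (23), can be bounded using the union
bound as

\[ P(\tau(\mathbf{a},\mathbf{b}) \leq \epsilon) \leq \sum_{\bar{\mathbf{A}}_i \in \mathfrak{A}} P(\tau(\bar{\mathbf{A}}_i) \leq \epsilon) \]

where $\mathfrak{A}$ is chosen according to Lemma 4 and where $\tau(\bar{\mathbf{A}}_i)$ is given by (28). Each term in the sum is
upper bounded by

\[ P(\tau(\bar{\mathbf{A}}_i) \leq \epsilon) \,\dot{\leq}\, \epsilon^{\nu} \]

where $\nu$ is given in Lemma 3. The number of terms in the sum is upper bounded by

\[ |\mathfrak{A}| \,\dot{\leq}\, \epsilon^{-\mu} \]

where $\mu$ is given by (31). Thus, the probability that $\tau(\mathbf{a},\mathbf{b}) \leq \epsilon$ is bounded as

\[ P(\tau(\mathbf{a},\mathbf{b}) \leq \epsilon) \,\dot{\leq}\, \epsilon^{\nu - \mu} \]

where

\begin{align*}
\nu - \mu &= \sum_{k=1}^{m} \frac{n(1-a_k)^+}{2} - \sum_{k=2}^{m} \frac{(m-k+2)(1-b_k)^+}{2} \\
 &\geq \frac{n}{2} + \sum_{k=2}^{m} \frac{(n-m+k-2)(1-b_k)}{2} - \frac{mn\delta}{2} \\
 &\geq \zeta - \frac{mn\delta}{2}
\end{align*}

and where the property

\[ (1-a_k)^+ \geq 1 - b_k - \delta \]

(for $a_k$ chosen as above) was used to establish the first inequality. The second inequality follows by the
definition of $\zeta$ in (33) along with $c_k = 1 - b_k \in [0,1]$.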
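Now, let

\[ \mathcal{A} \triangleq \varphi(\mathcal{Y}) \]

where $\varphi$ is given by (24). Note that we can pick a finite set of $\mathbf{b} \in [0,1]^m$, $\mathfrak{B} = \{\mathbf{b}_i\}$, such that

\[ \mathcal{A} \subset \bigcup_{\mathbf{b} \in \mathfrak{B}} \mathcal{A}(\mathbf{a},\mathbf{b}) \tag{34} \]

where $\mathbf{a} = \mathbf{a}(\mathbf{b})$ according to the above. This follows since by specifying $\mathbf{b} = (b_1, \ldots, b_m)$ we include
the matrices $\mathbf{Y} \in \mathcal{Y}$ for which the $k$th eigenvalue satisfies $\epsilon^{b_k+\delta} \leq \sigma_k \leq \epsilon^{b_k}$ if $b_k < 1$ and $\sigma_k \leq \epsilon$ if
$b_k = 1$. Thus we can cover the entire range of $\sigma_k \in [0,1]$ with a finite number of $b_k \in [0,1]$. For the
special case of $k = 1$ we know that $\sigma_1$ is bounded away from $0$ due to $\mathrm{Tr}(\mathbf{Y}) = 1$, which implies that
$\sigma_1 \in [\epsilon^{\delta}, 1]$ for sufficiently small $\epsilon$ given that $\delta > 0$, which is why $b_1 = 0$ can be assumed without loss
of generality.

Using the union bound it follows that

\[ P(\tau \leq \epsilon) \leq \sum_{\mathbf{b} \in \mathfrak{B}} P(\tau(\mathbf{a},\mathbf{b}) \leq \epsilon) \,\dot{\leq}\, \epsilon^{\zeta - \frac{mn\delta}{2}} \]

since each term in the sum satisfies

\[ P(\tau(\mathbf{a},\mathbf{b}) \leq \epsilon) \,\dot{\leq}\, \epsilon^{\zeta - \frac{mn\delta}{2}} \]

and the number of terms is finite. However, as $\delta > 0$ was arbitrary it follows that

\[ P(\tau \leq \epsilon) \,\dot{\leq}\, \epsilon^{\zeta} \]

which concludes the proof. ■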
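
The bookkeeping in the proof can be checked mechanically. The sketch below evaluates the exponent $\nu$ of Lemma 3 and $\mu$ of (31) for a given $\mathbf{b}$ and $\delta$, with $\mathbf{a}$ constructed as in the proof, and compares $\nu - \mu$ with the lower bound used to establish the first inequality. All function names are illustrative and are not taken from the paper.

```python
def pos(x: float) -> float:
    """(x)^+ = max(0, x)."""
    return max(0.0, x)

def a_from_b(b, delta):
    """a_1 = delta; a_k = b_k + delta if b_k + delta <= 1, otherwise infinity."""
    return [delta] + [bk + delta if bk + delta <= 1 else float("inf") for bk in b[1:]]

def nu(n, a):
    """Exponent of Lemma 3: sum over k = 1..m of n (1 - a_k)^+ / 2."""
    return sum(n * pos(1 - ak) / 2 for ak in a)

def mu(b):
    """Exponent of (31): sum over k = 2..m of (m - k + 2)(1 - b_k)^+ / 2."""
    m = len(b)
    return sum((m - k + 2) * pos(1 - b[k - 1]) / 2 for k in range(2, m + 1))

def lower_bound(n, b, delta):
    """n/2 + sum_{k=2}^m (n - m + k - 2)(1 - b_k)/2 - m n delta / 2."""
    m = len(b)
    tail = sum((n - m + k - 2) * (1 - b[k - 1]) / 2 for k in range(2, m + 1))
    return n / 2 + tail - m * n * delta / 2
```

For instance, with `n = 3`, `b = [0.0, 0.5, 1.0]` and `delta = 0.1`, the difference `nu(3, a_from_b(b, 0.1)) - mu(b)` is about 1.2, while `lower_bound(3, b, 0.1)` is about 1.05.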

In light of Lemma 5 the proof of Theorem 1 is now almost trivial. All that remains is to compute $\zeta$
in (33) and apply Lemma 2. We give the proof below.

Proof (of Theorem 1): For the case where $n \geq m$ all terms in the sum appearing in (33) are non negative.
Thus, the minimum in (33) is achieved for $c_2 = \ldots = c_m = 0$ and it follows that

\[ \zeta = \frac{n}{2}. \]

This, combined with Lemma 2, proves that

\[ P(\hat{\mathbf{s}}_{\mathrm{SDR}} \neq \mathbf{e}) \,\dot{\leq}\, \rho^{-\frac{n}{2}}. \]

Next, note that the error probability of the SDR receiver is lower bounded by

\[ P(\hat{\mathbf{s}}_{\mathrm{SDR}} \neq \mathbf{e}) \geq P(\hat{\mathbf{s}}_{\mathrm{ML}} \neq \mathbf{e}) \doteq \rho^{-\frac{n}{2}} \]

since the ML detector achieves the minimum probability of error. It therefore follows that

\[ P(\hat{\mathbf{s}}_{\mathrm{SDR}} \neq \mathbf{e}) \doteq P(\hat{\mathbf{s}}_{\mathrm{ML}} \neq \mathbf{e}) \doteq \rho^{-\frac{n}{2}}. \]

By noting again that $\mathbf{s} = \mathbf{e}$ can be assumed without loss of generality the statement of Theorem 1 follows.
■

VI. Extensions

At this stage, only the case of real valued systems of the form of (1) has been considered. Also, for
the proof of Theorem 1 it was assumed that $n \geq m$. In this section, we discuss the extensions which
would follow by relaxing these constraints, and some illustrative numerical examples are given.

A. The n < m case

As stated above, full diversity has so far been shown under the condition that $n \geq m$. However, a
careful inspection of the proofs shows that the only part which explicitly relies on this assumption is when
it is argued that $c_2 = \ldots = c_m = 0$ is an optimal point for (33) in the $n \geq m$ case. However, nontrivial
bounds on the diversity will follow whenever $\zeta$ in (33) is strictly positive. The following theorem provides
a lower bound on the diversity for the case when $n < m$.

Theorem 2: Given the assumptions of Theorem 1 but for $r \triangleq m - n > 0$, it holds that

\[ \lim_{\rho \rightarrow \infty} \frac{\ln P(\hat{\mathbf{s}}_{\mathrm{SDR}} \neq \mathbf{s})}{\ln \rho} \leq -d \]

where

\[ d \triangleq \frac{1}{2}\left(m - \frac{r(r+3)}{2}\right). \tag{35} \]

Proof: All that needs to be done in this case is to find the optimum in (33) and apply Lemma 2. To this
end, note that the optimum of (33) is achieved for $c_k = 1$ for all $k$ satisfying

\[ n - m + k - 2 < 0 \;\Leftrightarrow\; k \leq m - n + 1 \]

and $c_k = 0$ for $k$ satisfying

\[ n - m + k - 2 \geq 0 \;\Leftrightarrow\; k \geq m - n + 2. \]

The value of $\zeta$ in (33) is thus given as

\[ \zeta = \frac{n}{2} + \sum_{k=2}^{m-n+1} \frac{n - m + k - 2}{2} = \frac{1}{2}\left(m - \frac{r(r+3)}{2}\right). \]

This completes the proof. ■
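
The optimization in (33) is linear in each $c_k$, so it can be solved by inspection: set $c_k = 1$ exactly when its coefficient is negative. The following sketch (function names are illustrative) evaluates $\zeta$ this way and compares it with the closed form in (35).

```python
def zeta(n: int, m: int) -> float:
    """Minimum in (33): n/2 plus the sum of all negative coefficients (n - m + k - 2)/2."""
    coeffs = [(n - m + k - 2) / 2 for k in range(2, m + 1)]
    return n / 2 + sum(c for c in coeffs if c < 0)

def theorem2_bound(n: int, m: int) -> float:
    """d = (m - r(r + 3)/2) / 2 with r = m - n, as in (35); intended for n < m."""
    r = m - n
    return 0.5 * (m - r * (r + 3) / 2)
```

For `n >= m` the function `zeta` returns `n / 2`, the full diversity of Theorem 1, while for `n = 3`, `m = 4` both functions return `1.0`.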
Note that this result is only nontrivial if

\[ m > \frac{r(r+3)}{2} \]

as otherwise Theorem 2 would simply state that the probability of error is less than one. Further, we
have no specific reason to believe that the bound is tight (in the sense that $\dot{\leq}$ could be replaced by $\doteq$)
in the $n < m$ case, even in the cases where the bound is non-trivial. An indication of this is given in
Fig. 5 where the diversity of the SDR detector seems to be larger than 2, which is predicted by (35). It
is however also unreasonable to expect the bound to be very loose in the sense that the SDR detector
would maintain the same diversity as the ML detector in the general case where $n < m$. This is indicated
by Fig. 4 where the error probability of the SDR is significantly larger than that of the ML detector.
Intuitively, in the $n < m$ case, it can become likely that a matrix with higher rank than one achieves
the minimum in (12). Therefore, the typical error events of the SDR detector no longer coincide with
the error events of the ML detector and the SDR detector can experience a loss in diversity. We do not
however, as pointed out above, expect the loss to be as large as what is indicated by (35).

A possible way to strengthen the analysis in the $n < m$ case can actually be seen by turning back
to Fig. 2. Essentially, as part of proving Theorem 1 (and Theorem 2) the intersection of $\mathcal{X}$ and $\mathcal{H}$ is
covered with $\epsilon$-balls. However, due to the linearity of the objective function it is already known that the
minimum objective value over the intersection set must be achieved by one of the boundary points of
$\mathcal{X}$. Therefore, it would suffice to cover the intersection of $\mathcal{H}$ with the boundary of $\mathcal{X}$. This would in
turn strengthen the bound on $|\mathfrak{A}|$ in Lemma 4 but would also require a framework for parameterizing
the boundary set. It may also be possible to use the structure of the problem in other ways. One such
way could be to make use of the results in [29] (where bounds on the rank of extremal matrices for
semidefinite programs are provided) to further limit the part of the feasible set that needs to be covered.

B. Complex channel matrices

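It is well known that the SDR receiver is also applicable to the case where 4-QAM symbols are
transmitted over a complex valued MIMO channel, see e.g. [5]. The most direct strategy is to rewrite
the problem in an equivalent real valued form according to

\[ \begin{bmatrix} \Re(\mathbf{y}_c) \\ \Im(\mathbf{y}_c) \end{bmatrix} = \begin{bmatrix} \Re(\mathbf{H}_c) & -\Im(\mathbf{H}_c) \\ \Im(\mathbf{H}_c) & \Re(\mathbf{H}_c) \end{bmatrix} \begin{bmatrix} \Re(\mathbf{s}_c) \\ \Im(\mathbf{s}_c) \end{bmatrix} + \begin{bmatrix} \Re(\mathbf{v}_c) \\ \Im(\mathbf{v}_c) \end{bmatrix} \tag{36} \]

where $\mathbf{y}_c$, $\mathbf{H}_c$, $\mathbf{s}_c$ and $\mathbf{v}_c$ are the complex valued quantities corresponding to those in (1)
and where $\Re(\cdot)$ and $\Im(\cdot)$ denote the real and imaginary parts.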
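
The real valued rewriting can be sketched in a few lines of Python. The example below stacks real and imaginary parts in the usual way and checks that the stacked model reproduces the complex one; the channel, symbol and noise values are arbitrary illustrations and the helper names are hypothetical.

```python
def to_real_form(Hc, sc, vc):
    """Map complex (Hc, sc, vc) to the stacked real valued (H, s, v)."""
    top = [[h.real for h in row] + [-h.imag for h in row] for row in Hc]
    bottom = [[h.imag for h in row] + [h.real for h in row] for row in Hc]

    def stack(x):
        return [z.real for z in x] + [z.imag for z in x]

    return top + bottom, stack(sc), stack(vc)

def matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

# Illustrative values only: a 2x2 complex channel and two 4-QAM symbols.
Hc = [[0.3 + 1.1j, -0.7 + 0.2j], [1.5 - 0.4j, 0.6 + 0.9j]]
sc = [1 + 1j, -1 + 1j]
vc = [0.05 - 0.02j, -0.01 + 0.03j]

yc = [sum(h * s for h, s in zip(row, sc)) + v for row, v in zip(Hc, vc)]
H, s, v = to_real_form(Hc, sc, vc)
y = [a + b for a, b in zip(matvec(H, s), v)]
y_expected = [z.real for z in yc] + [z.imag for z in yc]
```

Note that the stacked symbol vector `s` has entries in {+1, -1}, so the real valued problem is again a binary detection problem, but the effective channel matrix `H` now has a block structure rather than independent entries.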

However, the proof of Theorem 1 does unfortunately not extend to cover this case. The specific reason
is found in Lemma 3 where the rotational symmetry of $\mathbf{H}$ is explicitly used. This symmetry is lost in the
formulation given in (36), even in the case where $\mathbf{H}_c$ is i.i.d. complex, circularly symmetric, zero mean
Gaussian. More importantly, numerical simulations suggest that the extension of Theorem 1 to this case
may not even be true. An indication of this can be seen in Fig. 5 where it is plausible to believe that the
SDR receiver does experience a loss of diversity. However, it should also be pointed out that we do not
expect the loss (if any) to be very large in general. This belief is based on extensive simulations, such
as the one shown in Fig. 5, that indicate a high SDR diversity in the complex case.

At first sight, what would be required in order to cover the complex case would be to update Lemma 3
for the structure of the effective channel matrix, $\mathbf{H}$, in (36). It is however also likely that Lemma 4
would need to be strengthened (as discussed in Section VI-A) in order to obtain a tight bound on the
diversity. However, these steps remain a challenge. Also, note that if the SDR detector does not achieve
full diversity, the issue of providing a lower bound on the error probability (or equivalently an upper
bound on diversity) will also become more challenging.

VII. Conclusions 

In this paper we have shown that when applied to a fading channel, modelled by a real valued matrix
with i.i.d. Gaussian entries of zero mean and finite variance, the semidefinite relaxation detector achieves
the maximum possible diversity. This provides a strong performance guarantee for the SDR approach,
when applied in the communications context. Based on the discussions in Section VI it does not seem
reasonable to expect such a strong statement to hold for an arbitrary system. Nonetheless, it is still
reasonable to assume that the SDR detector will be superior to the class of linear detectors and other
relaxation techniques.



Appendix I
Exponential Equality 

For the reader's convenience, we list the (for this work) most important properties associated with the
definition of exponential equality. These properties are easily derived from the definition
and can also be found (often implicitly) in many texts, see e.g. [27], [2]. Thus, we state the properties
without proof.

1) Scaling property: For any $a \in [-\infty, \infty]$ and $c \in (-\infty, \infty)$ it holds that

\[ f(\rho) \doteq \rho^{-a} \;\Rightarrow\; c f(\rho) \doteq \rho^{-a}. \tag{37a} \]

2) Summation property: For any $a, b \in [-\infty, \infty]$ it holds that

\[ f(\rho) \doteq \rho^{-a},\; g(\rho) \doteq \rho^{-b} \;\Rightarrow\; f(\rho) + g(\rho) \doteq \rho^{-\min(a,b)}. \tag{37b} \]

This property extends in the obvious way to the sum of finitely many terms.

3) Multiplication property: For any $a, b \in [-\infty, \infty]$ it holds that

\[ f(\rho) \doteq \rho^{-a},\; g(\rho) \doteq \rho^{-b} \;\Rightarrow\; f(\rho) g(\rho) \doteq \rho^{-(a+b)} \tag{37c} \]

if the cases where $a + b$ is not well defined are excluded.

4) Extremal realizations of Gaussian vectors: Let $\mathbf{h} \in \mathbb{R}^d$ be a vector of i.i.d. Gaussian elements of
finite non-zero variance. Then

\[ P(\|\mathbf{h}\|^2 \leq \rho^{-c}) \doteq \rho^{-\frac{d c^+}{2}} \tag{37d} \]

for $c \in (-\infty, \infty)$, where $c^+ = \max(c, 0)$ and

\[ P(\|\mathbf{h}\|^2 \geq \rho^{c}) \doteq \rho^{-\infty} \tag{37e} \]

for $c > 0$. These properties follow by noting that $\|\mathbf{h}\|^2$ is $\chi^2$ distributed with $d$ degrees of freedom,
see e.g. [2, Section 5.4.2].

It should also be noted that the properties given in (37a), (37b) and (37c) also hold with $\dot{\leq}$ or $\dot{\geq}$ in place
of $\doteq$.
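
The summation and multiplication properties can be illustrated numerically. The sketch assumes the usual reading of $f(\rho) \doteq \rho^{-a}$ as $\lim_{\rho \to \infty} \ln f(\rho) / \ln \rho = -a$, and estimates the exponent at a single large but finite $\rho$; the functions `f` and `g` are arbitrary examples.

```python
import math

def exponent(func, rho: float) -> float:
    """Finite-rho estimate of a in func(rho) ≐ rho^{-a}: -ln func(rho) / ln rho."""
    return -math.log(func(rho)) / math.log(rho)

def f(rho: float) -> float:
    return 3.0 * rho ** -2.0      # f ≐ rho^{-2}; the constant 3 does not matter (37a)

def g(rho: float) -> float:
    return 5.0 * rho ** -1.0      # g ≐ rho^{-1}

rho = 1e30
sum_exp = exponent(lambda r: f(r) + g(r), rho)    # tends to min(2, 1) = 1 as rho grows
prod_exp = exponent(lambda r: f(r) * g(r), rho)   # tends to 2 + 1 = 3 as rho grows
```

At any finite `rho` the estimates are off by a term of order `1 / ln(rho)` coming from the constants, which is exactly the part that exponential equality ignores.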



Appendix II 
Proof of Lemma 4 

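Before proving Lemma 4 we establish the following technical result regarding the feasible set of (12).

Lemma 6: The set $\mathcal{Y}$ defined in (13) satisfies

\[ \mathcal{Y} = \{\mathbf{Y} \in \mathbb{S}^m \mid \mathrm{Tr}(\mathbf{Y}) = 1,\; \mathbf{Y} \succeq \tfrac{1}{4}\mathbf{d}\mathbf{d}^T,\; \mathbf{d} = \mathrm{diag}(\mathbf{Y})\}. \tag{38} \]

Proof: Consider the transformation given by

\[ \mathbf{P} \triangleq \begin{bmatrix} \mathbf{Y} & \mathbf{a} \\ \mathbf{a}^T & c \end{bmatrix} = \underbrace{\begin{bmatrix} \mathbf{I} & -\mathbf{e} \\ \mathbf{0}^T & 1 \end{bmatrix}}_{\mathbf{T}} \mathbf{X} \underbrace{\begin{bmatrix} \mathbf{I} & \mathbf{0} \\ -\mathbf{e}^T & 1 \end{bmatrix}}_{\mathbf{T}^T} \tag{39} \]

or inversely,

\[ \mathbf{X} = \underbrace{\begin{bmatrix} \mathbf{I} & \mathbf{e} \\ \mathbf{0}^T & 1 \end{bmatrix}}_{\mathbf{R}} \begin{bmatrix} \mathbf{Y} & \mathbf{a} \\ \mathbf{a}^T & c \end{bmatrix} \underbrace{\begin{bmatrix} \mathbf{I} & \mathbf{0} \\ \mathbf{e}^T & 1 \end{bmatrix}}_{\mathbf{R}^T} \tag{40} \]

since $\mathbf{T}^{-1} = \mathbf{R}$. Note also that $\mathbf{Y}$ is given by $\mathbf{Y} = \mathbf{M}\mathbf{X}\mathbf{M}^T$ as $\mathbf{M} = [\,\mathbf{I} \;\; -\mathbf{e}\,]$ by (14). Expanding $\mathbf{X}$
from (40) yields

\[ \mathbf{X} = \begin{bmatrix} \mathbf{Y} + \mathbf{a}\mathbf{e}^T + \mathbf{e}\mathbf{a}^T + \mathbf{e}c\mathbf{e}^T & \mathbf{a} + \mathbf{e}c \\ \mathbf{a}^T + c\mathbf{e}^T & c \end{bmatrix}. \]

Thus, the constraint $\mathrm{diag}(\mathbf{X}) = \mathbf{e}$ for $\mathbf{X} \in \mathcal{X}$ implies that $c = 1$ for $\mathbf{Y} \in \mathcal{Y}$ since $\mathcal{Y} = \mathbf{M}\mathcal{X}\mathbf{M}^T$
for $\mathcal{Y}$ given in (13) and where $\mathbf{M}$ is given in (14). Further, for $c = 1$

\[ \mathrm{diag}(\mathbf{Y} + \mathbf{a}\mathbf{e}^T + \mathbf{e}\mathbf{a}^T + \mathbf{e}\mathbf{e}^T) = \mathrm{diag}(\mathbf{Y}) + 2\mathbf{a} + \mathbf{e} = \mathbf{e} \]

which implies that

\[ \mathbf{a} = -\tfrac{1}{2}\mathrm{diag}(\mathbf{Y}). \tag{41} \]

Thus, given a matrix $\mathbf{Y} \in \mathcal{Y}$ there is actually a unique $\mathbf{X} \in \mathcal{X}$ for which $\mathbf{Y} = \mathbf{M}\mathbf{X}\mathbf{M}^T$. In other words,
the mapping from $\mathcal{X}$ to $\mathcal{Y}$ is one-to-one.

Since $\mathbf{T}$ (and $\mathbf{R}$) are invertible the constraint $\mathbf{X} \succeq 0$ is equivalent to $\mathbf{P} \succeq 0$. However, $\mathbf{P} \succeq 0$ if and
only if its Schur complement [20] is positive semidefinite, i.e. if

\[ \mathbf{Y} - c^{-1}\mathbf{a}\mathbf{a}^T \succeq 0. \]

Thus, by combining (41) with $c = 1$ and identifying $\mathbf{d} = -2\mathbf{a}$ the equalities of (13) and (38) are
established. ■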
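We are now in a position to prove the statement given by Lemma 4. For convenience the lemma is
restated below.

Lemma 4: Let $\mathcal{A}(\mathbf{a},\mathbf{b})$ and $\mathcal{A}_\epsilon(\bar{\mathbf{A}})$ be defined as in (25) and (27), respectively. Then there is a collection
of points, $\mathfrak{A} = \{\bar{\mathbf{A}}_i\}$, for which

\[ \mathcal{A}(\mathbf{a},\mathbf{b}) \subset \bigcup_{\bar{\mathbf{A}}_i \in \mathfrak{A}} \mathcal{A}_\epsilon(\bar{\mathbf{A}}_i) \]

and

\[ |\mathfrak{A}| \,\dot{\leq}\, \epsilon^{-\mu} \]

where

\[ \mu \triangleq \sum_{k=2}^{m} \frac{(m-k+2)(1-b_k)^+}{2}. \]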

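Proof: Consider the triplet $(\mathbf{U}, \lambda, \mathbf{z}) \in \mathbb{R}^{m \times m} \times \mathbb{R}^m \times \mathbb{R}^m$ and the system of equations given by

\begin{align}
\mathrm{Tr}(\Lambda^2) &= 1 \tag{42a} \\
\mathrm{diag}(\mathbf{U}\Lambda^2\mathbf{U}^T) &= \mathbf{U}\mathbf{z} \tag{42b} \\
\mathbf{U}^T\mathbf{U} &= \mathbf{I} \tag{42c} \\
\Lambda^2 - \tfrac{1}{4}\mathbf{z}\mathbf{z}^T &\succeq 0 \tag{42d}
\end{align}

where $\Lambda = \mathrm{diag}(\lambda)$. The set of solutions to (42) will in what follows be denoted by $\mathcal{M}$. The set of
solutions to (42a), (42b) and (42c) but not necessarily (42d) is denoted by $\mathcal{N}$ and it follows that $\mathcal{M} \subset \mathcal{N}$.
From (42a) and (42c) it follows that $\lambda$ and $\mathbf{U}$ in the solution set are bounded. However, as $\mathbf{U}$ is full
rank due to (42c) it follows through (42b) that $\mathbf{z}$ is also bounded. Therefore, both $\mathcal{N}$ and $\mathcal{M}$ are compact
(closed and bounded) sets.

The constraints of (42) are such that any solution, $(\mathbf{U}, \lambda, \mathbf{z})$, of (42) satisfies $\mathbf{U}\Lambda^2\mathbf{U}^T \in \mathcal{Y}$ and any
eigenvalue decomposition, $\mathbf{Y} = \mathbf{U}\Sigma\mathbf{U}^T$, of $\mathbf{Y} \in \mathcal{Y}$ solves (42) for $\Lambda = \Sigma^{1/2}$ and some (unique) $\mathbf{z}$. To see
this, consider the eigenvalue decomposition, $\mathbf{Y} = \mathbf{U}\Sigma\mathbf{U}^T$, of some $\mathbf{Y} \in \mathcal{Y}$ where $\mathcal{Y}$ is given by (13).
Note also that $\mathbf{Y}$ belongs to $\mathcal{Y}$ if and only if it satisfies the constraints of (38) as proven in Lemma 6.
The orthogonality of $\mathbf{U} \in \mathbb{R}^{m \times m}$ is a property of the eigenvalue decomposition and therefore (42c) is
satisfied. For $\Lambda = \Sigma^{1/2}$ and $\mathbf{z} = \mathbf{U}^T\mathrm{diag}(\mathbf{U}\Lambda^2\mathbf{U}^T)$ the constraint of (42b) is satisfied. As $\mathbf{Y} \in \mathcal{Y}$ it
follows that $\mathbf{Y} - \tfrac{1}{4}\mathbf{d}\mathbf{d}^T \succeq 0$ where $\mathbf{d} = \mathrm{diag}(\mathbf{Y})$. Therefore, $\mathrm{diag}(\mathbf{Y}) = \mathrm{diag}(\mathbf{U}\Lambda^2\mathbf{U}^T) = \mathbf{U}\mathbf{z}$ implies

\[ \mathbf{U}\Lambda^2\mathbf{U}^T - \tfrac{1}{4}\mathbf{U}\mathbf{z}\mathbf{z}^T\mathbf{U}^T \succeq 0 \;\Leftrightarrow\; \Lambda^2 - \tfrac{1}{4}\mathbf{z}\mathbf{z}^T \succeq 0 \]

which means that (42d) is satisfied. Finally, the constraint $\mathrm{Tr}(\mathbf{Y}) = 1$ in (38) implies $\mathrm{Tr}(\Lambda^2) = 1$
and (42a) is satisfied. Reversing the reasoning and applying Lemma 6 shows that any solution to (42)
must also have the property that $\mathbf{U}\Lambda^2\mathbf{U}^T \in \mathcal{Y}$.

The value of introducing (42) is that it will, through the implicit function theorem [31], provide a
means of parameterizing the eigenvalues and vectors of $\mathbf{Y} \in \mathcal{Y}$. To this end, let

\[ p \triangleq m + \frac{m(m+1)}{2} + 1, \qquad q \triangleq m^2 + 2m, \]

and $\omega \in \mathbb{R}^q$ be given by

\[ \omega \triangleq (\mathbf{U}, \lambda, \mathbf{z}). \]

Define

\[ H : \mathbb{R}^q \rightarrow \mathbb{R}^p \]

according to

\[ H(\omega) \triangleq \begin{bmatrix} \mathrm{Tr}(\Lambda^2) - 1 \\ \mathrm{diag}(\mathbf{U}\Lambda^2\mathbf{U}^T) - \mathbf{U}\mathbf{z} \\ \mathrm{svec}(\mathbf{U}^T\mathbf{U} - \mathbf{I}) \end{bmatrix} \]

and note that $H(\omega) = 0$ corresponds to (42a), (42b) and (42c). In the above, $\mathrm{svec}(\cdot)$ refers to the
vector obtained by stacking the upper triangular part of a symmetric matrix into a vector. Let $\bar{\omega}$
be a solution of (42) and $\mathcal{I}$ be an index set satisfying

\[ \mathcal{I} \subset \{1, \ldots, q\} \tag{43} \]

and

\[ |\mathcal{I}| = p. \tag{44} \]

Denote by $\omega_{\mathcal{I}} \in \mathbb{R}^p$ the vector of components in $\omega$ indexed by $\mathcal{I}$ and let $\omega_{\mathcal{I}^c} \in \mathbb{R}^{q-p}$ be the vector
consisting of the remaining components. The implicit function theorem [31] states that if

\[ \det\left(\frac{\partial H(\bar{\omega})}{\partial \omega_{\mathcal{I}}}\right) \neq 0 \tag{45} \]

then there is a neighborhood, $\mathcal{U} \subset \mathbb{R}^q$, containing $\bar{\omega}$ and a differentiable mapping

\[ g : \mathbb{R}^{q-p} \rightarrow \mathbb{R}^p \]

satisfying $\omega_{\mathcal{I}} = g(\omega_{\mathcal{I}^c})$ for any $\omega \in \mathcal{U} \cap H^{-1}(\{0\})$.

Further, (45) implies the existence of a differentiable mapping

\[ \psi : \mathcal{D} \rightarrow \mathcal{R} \]

for which $\omega = \psi(\xi)$, where $\xi = \omega_{\mathcal{I}^c} - \bar{\omega}_{\mathcal{I}^c} \in \mathcal{D}$, where $\mathcal{D}$ is an open subset of $\mathbb{R}^{q-p}$ containing $0$
and where $\mathcal{R} = \psi(\mathcal{D}) \subset \mathbb{R}^q$. This mapping is easily obtained from $g$ by including the components
in $\omega_{\mathcal{I}^c}$ and performing a translation to a neighborhood of $0$. Thus, assuming that (45) is satisfied, the
solution set of (42) is locally parameterized by $q - p$ scalar parameters. It will in fact later be shown that
given any solution, $\bar{\omega}$, to (42) there will be some index set, $\mathcal{I}$, satisfying (43) and (44) for which (45)
is satisfied. This implies that $\mathcal{N}$ is a $q - p$ dimensional (smooth) manifold embedded in $\mathbb{R}^q$ [32]. Note
however that the specific index set, $\mathcal{I}$, required to satisfy (45) will generally depend on the particular $\bar{\omega}$
chosen. This is analogous to the problem of parameterizing the unit circle based on solving $x^2 + y^2 = 1$
where the choice of $x$ or $y$ as the free parameter depends on if the parametrization neighborhood should
include $x = 0$ or $y = 0$.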
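
The unit circle analogy can be made concrete. In the sketch below, the constraint is $F(x, y) = x^2 + y^2 - 1 = 0$ with partial derivatives $(2x, 2y)$; the coordinate whose partial derivative is larger in magnitude at the chosen point is solved for, and the other one is used as the free parameter. The function name and the selection rule are illustrative choices, not part of the paper.

```python
import math

def local_parametrization(x0: float, y0: float):
    """Return (name of the free coordinate, map from the free coordinate to (x, y))
    for a point (x0, y0) on the unit circle."""
    if abs(y0) >= abs(x0):
        # dF/dy = 2*y0 is safely nonzero: solve for y, keep x free.
        sign = 1.0 if y0 > 0 else -1.0
        return "x", lambda x: (x, sign * math.sqrt(1.0 - x * x))
    # Otherwise dF/dx = 2*x0 is safely nonzero: solve for x, keep y free.
    sign = 1.0 if x0 > 0 else -1.0
    return "y", lambda y: (sign * math.sqrt(1.0 - y * y), y)
```

Around the point (1, 0), where $y = 0$, the free parameter must be `y`; around (0, -1), where $x = 0$, it must be `x`. The index set $\mathcal{I}$ in (45) plays the same role for the higher dimensional system.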

Note that it can without loss of generality be assumed that the domain of $\psi$ is given by

\[ \mathcal{D} = (-\kappa, \kappa)^{q-p}, \tag{46} \]

i.e. that $\mathcal{D}$ is an open hypercube for some $\kappa > 0$ [32]. Further, since $\mathcal{N}$ is compact it can be assumed
that $\kappa$ is independent of $\bar{\omega}$. It can also, without loss of generality, be assumed that $\psi$ is Lipschitz
continuous [33] on $\mathcal{D}$. This follows since the inverse function theorem guarantees that $\psi$ has continuous
derivatives on the closure of $\mathcal{D}$, $\bar{\mathcal{D}}$ (actually, in its standard form the inverse function theorem guarantees
continuous derivatives on $\mathcal{D}$ but by reducing $\kappa$ if necessary the continuity can be extended to the closure
of $\mathcal{D}$). Further, again due to the compactness of $\mathcal{M}$, it can be assumed that the Lipschitz constant of $\psi$
is independent of $\bar{\omega}$.

In order to prove the existence of an index set, $\mathcal{I}$, for which (45) is satisfied it is sufficient to prove
that the Jacobian matrix $\mathbf{D}$,

\[ \mathbf{D} \triangleq \frac{\partial H(\bar{\omega})}{\partial \omega} \in \mathbb{R}^{p \times q}, \tag{47} \]

is full rank. In this event, the index set, $\mathcal{I}$, can be taken as the indexes of any $p$ linearly independent
columns of $\mathbf{D}$. For our purposes however, we shall need to be a bit more specific about how $\mathcal{I}$ is chosen.
Therefore, note again that it will be of particular interest to study parameterizations of $\mathcal{M}$ (and $\mathcal{N}$) around
solutions $\bar{\omega}$ corresponding to rank deficient $\mathbf{Y} \in \mathcal{Y}$ (see the discussion in Section IV-A.3). To this end,
consider some $\bar{\omega} \in \mathcal{M}$ for which $\lambda_{r+1} = \ldots = \lambda_m = 0$, i.e. $\bar{\omega}$ corresponds to a rank $r$ matrix $\mathbf{Y} \in \mathcal{Y}$.
Here, and in what follows, $\lambda_k$ and $z_k$ refer to the $k$th component of $\lambda$ and $\mathbf{z}$ respectively. For any $\omega \in \mathcal{M}$
it follows by (42d) that $|z_k| \leq 2|\lambda_k|$ for $k = 1, \ldots, m$ and in particular it follows that $z_k = 0$ whenever
$\lambda_k = 0$. We will in what follows refer to any $\bar{\omega} \in \mathcal{N}$ which satisfies both $\lambda_{r+1} = \ldots = \lambda_m = 0$
and $z_{r+1} = \ldots = z_m = 0$ as a rank $r$ point, even in the case that $\bar{\omega} \notin \mathcal{M}$. The reason for using
this terminology is that it is often difficult to verify that (42d) is satisfied but sufficient to provide a
parametrization around rank $r$ points, $\bar{\omega} \in \mathcal{N}$.
Let

\[ p_r \triangleq m + \frac{r(r+1)}{2} + 1 \]

and

\[ q_r \triangleq r(m+2) \]

and note that $p = p_m$ and $q = q_m$. Further, let $\mathbf{u}_k$ denote the $k$th column of $\mathbf{U}$. It will in what follows
be shown that $\omega$, in a neighborhood of a rank $r$ point, $\bar{\omega}$, can be parameterized by specifying $\lambda_k$ and
$z_k$ for $k = r+1, \ldots, m$, a subset of $m - k$ parameters from $\mathbf{u}_k$ for $k = r+1, \ldots, m$, and a subset of
$q_r - p_r$ parameters from

\[ \omega_r \triangleq (\mathbf{u}_1, \ldots, \mathbf{u}_r, \lambda_1, \ldots, \lambda_r, z_1, \ldots, z_r). \]

It is straightforward to verify that this amounts to a total of $q - p$ parameters. The specific parameters
chosen from $\mathbf{u}_k$ for $k = r+1, \ldots, m$ and from $\omega_r$ will remain unspecified. In line with the previous
discussion these must ultimately depend on the specific $\bar{\omega}$ around which $\mathcal{M}$ or $\mathcal{N}$ is parameterized.
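
The parameter count claimed above is a short exercise in arithmetic, sketched here in Python (the function names are illustrative): $2(m - r)$ parameters for $\lambda_k$ and $z_k$, $m - k$ parameters from each $\mathbf{u}_k$ with $k > r$, and $q_r - p_r$ parameters from $\omega_r$.

```python
def p_r(m: int, r: int) -> int:
    """p_r = m + r(r + 1)/2 + 1."""
    return m + r * (r + 1) // 2 + 1

def q_r(m: int, r: int) -> int:
    """q_r = r(m + 2)."""
    return r * (m + 2)

def free_parameters(m: int, r: int) -> int:
    """Total number of free parameters around a rank r point."""
    tail = 2 * (m - r)                               # lambda_k and z_k, k = r+1..m
    cols = sum(m - k for k in range(r + 1, m + 1))   # m - k entries from each u_k
    head = q_r(m, r) - p_r(m, r)                     # parameters taken from omega_r
    return tail + cols + head
```

For every `1 <= r <= m` the total equals `q_r(m, m) - p_r(m, m)`, i.e. $q - p = m(m+1)/2 - 1$, independently of `r`.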

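Before proving the preceding statement consider first the slightly more general system of equations
given by

\begin{align}
\mathrm{Tr}(\Lambda_r^2) + \eta &= 1 \tag{48a} \\
\mathrm{diag}(\mathbf{U}_r\Lambda_r^2\mathbf{U}_r^T) + \gamma &= \mathbf{U}_r\mathbf{z}_r \tag{48b} \\
\mathbf{U}_r^T\mathbf{U}_r &= \mathbf{I} \tag{48c}
\end{align}

where $(\mathbf{U}_r, \lambda_r, \mathbf{z}_r, \gamma, \eta) \in \mathbb{R}^{m \times r} \times \mathbb{R}^r \times \mathbb{R}^r \times \mathbb{R}^m \times \mathbb{R}$ for some $r$, $1 \leq r \leq m$. For now, it is sufficient
to view the addition of $\gamma$ and $\eta$ as (small) perturbations of the constraints in (48). These will later be
used to develop a perturbation analysis of the solutions to (42) around the rank $r$ points.
Let

\[ \omega_r \triangleq (\mathbf{U}_r, \lambda_r, \mathbf{z}_r) \]

and define $\bar{\omega}_r$ analogously. Define

\[ H_r : \mathbb{R}^{q_r + m + 1} \rightarrow \mathbb{R}^{p_r} \]

according to

\[ H_r(\omega_r, \gamma, \eta) \triangleq \begin{bmatrix} \mathrm{Tr}(\Lambda_r^2) + \eta - 1 \\ \mathrm{diag}(\mathbf{U}_r\Lambda_r^2\mathbf{U}_r^T) + \gamma - \mathbf{U}_r\mathbf{z}_r \\ \mathrm{svec}(\mathbf{U}_r^T\mathbf{U}_r - \mathbf{I}) \end{bmatrix} \]

and note that $H_r(\omega_r, \gamma, \eta) = 0$ is equivalent to (48). In order to establish that the solution set of (48)
can (locally around a particular solution $(\bar{\omega}_r, 0, 0)$) be parameterized by $q_r - p_r + m + 1$ parameters it is
sufficient to establish that the Jacobian

\[ \mathbf{D}_r \triangleq \frac{\partial H_r(\bar{\omega}_r)}{\partial \omega_r} \in \mathbb{R}^{p_r \times q_r} \tag{49} \]

is full rank when evaluated at $\bar{\omega}_r$ satisfying $H_r(\bar{\omega}_r, 0, 0) = 0$.

Note that, similarly to before, if $\mathbf{D}_r$ in (49) is full rank then this implies the existence of a Lipschitz
continuous function

\[ \psi_r : \mathcal{D}_r \rightarrow \mathcal{R}_r \tag{50} \]

where $(\mathbf{U}_r, \lambda_r, \mathbf{z}_r) = \psi_r(\xi_r, \gamma, \eta)$ for $\xi_r \in \mathbb{R}^{q_r - p_r}$, where $\mathcal{D}_r \subset \mathbb{R}^{q_r - p_r + m + 1}$ is an open neighborhood
of $0$, and where $\mathcal{R}_r = \psi_r(\mathcal{D}_r)$. Also, without loss of generality it can be assumed that

\[ \mathcal{D}_r = (-\kappa, \kappa)^{q_r - p_r + m + 1}. \]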


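In order to establish the full rank property of $\mathbf{D}_r$ consider the matrix

\[ \tilde{\mathbf{D}}_r \triangleq \frac{\partial H_r(\bar{\omega}_r)}{\partial(\mathbf{g}_1, \ldots, \mathbf{g}_m, \mathbf{z}_r, \lambda_r)} \]

where $\mathbf{g}_k^T$ is the $k$th row of $\mathbf{U}_r$, i.e.

\[ \mathbf{U}_r = \begin{bmatrix} \mathbf{u}_1 & \cdots & \mathbf{u}_r \end{bmatrix} = \begin{bmatrix} \mathbf{g}_1^T \\ \vdots \\ \mathbf{g}_m^T \end{bmatrix}. \]

Note that $\tilde{\mathbf{D}}_r$ is related to $\mathbf{D}_r$ by a permutation of the columns (due to a changed order of differentiation)
and that $\mathbf{D}_r$ is full rank if and only if $\tilde{\mathbf{D}}_r$ is full rank. Computing $\tilde{\mathbf{D}}_r$ (semi) explicitly yields

\[ \tilde{\mathbf{D}}_r = \begin{bmatrix}
\mathbf{0}^T & \cdots & \mathbf{0}^T & \mathbf{0}^T & 2\lambda_r^T \\
2\mathbf{g}_1^T\Lambda_r^2 - \mathbf{z}_r^T & & & -\mathbf{g}_1^T & 2(\Lambda_r\mathbf{g}_1^2)^T \\
 & \ddots & & \vdots & \vdots \\
 & & 2\mathbf{g}_m^T\Lambda_r^2 - \mathbf{z}_r^T & -\mathbf{g}_m^T & 2(\Lambda_r\mathbf{g}_m^2)^T \\
\mathbf{G}_1 & \cdots & \mathbf{G}_m & \mathbf{0} & \mathbf{0}
\end{bmatrix} \]

where

\[ \mathbf{G}_i \triangleq \frac{\partial G_r(\mathbf{U}_r)}{\partial \mathbf{g}_i} \quad\text{for}\quad G_r(\mathbf{U}_r) = \mathrm{svec}(\mathbf{U}_r^T\mathbf{U}_r - \mathbf{I}) \]

and where $\mathbf{g}_i^2$ denotes element wise squaring of $\mathbf{g}_i$. Assume first that $2\mathbf{g}_i^T\Lambda_r^2 - \mathbf{z}_r^T = \mathbf{0}$ for some $i$,
$1 \leq i \leq m$. This implies through (48b) (and $\gamma = 0$) that

\[ \mathbf{g}_i^T\Lambda_r^2\mathbf{g}_i = 2\mathbf{g}_i^T\Lambda_r^2\mathbf{g}_i \]

and in turn $\Lambda_r\mathbf{g}_i = \mathbf{0}$ as $\Lambda_r^2 \succeq 0$. Further, it follows that $\mathbf{z}_r = \mathbf{0}$ and that $\Lambda_r = \mathbf{0}$ by inserting $\mathbf{z}_r = \mathbf{0}$
into (48b). This however violates (48a) and contradicts that $\bar{\omega}_r$ is a solution to (48). Thus, it can be
assumed that $2\mathbf{g}_i^T\Lambda_r^2 - \mathbf{z}_r^T \neq \mathbf{0}$ for all $i = 1, \ldots, m$ which implies that the first $m + 1$ rows of $\tilde{\mathbf{D}}_r$ are
linearly independent.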

Establishing that the last $r(r+1)/2$ rows of $\mathbf{D}_r$ are linearly independent is a standard exercise in proving that the $(m, r)$-Stiefel manifold (the set of $m$ by $r$ unitary matrices) has dimension

$$mr - \frac{r(r+1)}{2}$$

which is a well known result [32]. We will for this reason not provide an explicit proof of this. In fact, the last $r(r+1)/2$ rows of $\mathbf{D}_r$ are not only linearly independent but also orthogonal.
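The rank and orthogonality claims for the Stiefel constraint can be checked numerically. The following is a minimal sketch and not part of the original derivation: the helper name `stiefel_constraint_jacobian` and the sizes $m = 5$, $r = 3$ are illustrative assumptions, and the unscaled upper-triangular vectorization used here may differ from the svec used in the text by a constant scaling of some rows, which affects neither rank nor orthogonality.

```python
import numpy as np


def stiefel_constraint_jacobian(U):
    """Jacobian of the upper-triangular entries of U^T U - I w.r.t. vec(U).

    Hypothetical helper: rows are ordered (1,1), (1,2), ..., (r,r) and
    columns follow column-major vec(U); no svec scaling is applied.
    """
    m, r = U.shape
    rows = []
    for a in range(r):
        for b in range(a, r):
            grad = np.zeros((m, r))
            # d(u_a^T u_b)/du_a = u_b and d(u_a^T u_b)/du_b = u_a;
            # for a == b the two updates add up to 2 u_a.
            grad[:, a] += U[:, b]
            grad[:, b] += U[:, a]
            rows.append(grad.flatten(order="F"))
    return np.array(rows)


rng = np.random.default_rng(0)
m, r = 5, 3  # illustrative sizes (assumption)
U, _ = np.linalg.qr(rng.standard_normal((m, r)))  # orthonormal columns

J = stiefel_constraint_jacobian(U)
rank = np.linalg.matrix_rank(J)
gram = J @ J.T
off_diag = gram - np.diag(np.diag(gram))

print("constraint rows:", J.shape[0], "rank:", rank)
print("manifold dimension m*r - rank:", m * r - rank)
print("largest off-diagonal Gram entry:", np.abs(off_diag).max())
```

At a point with orthonormal columns the Gram matrix of the constraint rows is diagonal up to rounding, so the rows are orthogonal and the rank equals $r(r+1)/2$, leaving $mr - r(r+1)/2$ free dimensions.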

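What now remains to be done, in order to show that the column-permuted Jacobian $\bar{\mathbf{D}}_r$ is full rank, is to prove that none of the first $m + 1$ rows can be written as a linear combination of the remaining $r(r+1)/2$ rows. For the first row, this is obvious due to the structure of $\bar{\mathbf{D}}_r$ together with [...] $\neq 0$. For the next $m$ rows the only potential problem would be if $\mathbf{g}_i = \mathbf{0}$ for some $i$. However, as

$$G_r(\mathbf{U}_r) = \mathrm{svec}(\mathbf{U}_r^{T}\mathbf{U}_r - \mathbf{I}) = \sum_{i=1}^{m} \mathrm{svec}(\mathbf{g}_i\mathbf{g}_i^{T}) - \mathrm{svec}(\mathbf{I})$$

it follows that $\mathbf{G}_i$ is linear in $\mathbf{g}_i$ and equal to zero whenever $\mathbf{g}_i = \mathbf{0}$. Together with the property that $2\mathbf{g}_i^{T}\boldsymbol{\Lambda}_r^2 - \mathbf{z}_r^{T} \neq \mathbf{0}$ it follows that none of the first $m + 1$ rows can be formed as a linear combination of the remaining $r(r+1)/2$ rows. This establishes that $\bar{\mathbf{D}}_r$, and $\mathbf{D}_r$, are full rank. Note that as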

D = D^ 

it also follows that the assertion of ( R3t has been proven. 

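Consider again the parametrization of $\mathcal{N}$ around some rank $r$ point $\boldsymbol{\omega} \in \mathcal{M}$ and consider the matrix

$$\mathbf{P} \triangleq \frac{\partial H(\boldsymbol{\omega})}{\partial(\boldsymbol{\omega}_r, \mathbf{u}_{r+1}, \ldots, \mathbf{u}_m)}.$$

Note that $\mathbf{P}$ is nothing more than $\mathbf{D}$ with the columns corresponding to $\lambda_k$ and $z_k$ for $k = r + 1, \ldots, m$ removed. It is straightforward to verify that $\mathbf{P}$ is structured as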



where 



X 
X 
X 

Ul 







X 
X 








(51) 



$$\mathbf{F}_k \triangleq \begin{bmatrix} \mathbf{u}_1 & \cdots & \mathbf{u}_{k-1} & 2\mathbf{u}_k \end{bmatrix}^{T} \qquad (52)$$

and where $\mathbf{u}_i$ is the $i$th column of $\mathbf{U}$ in $(\mathbf{U}, \boldsymbol{\lambda}, \mathbf{z}) = \boldsymbol{\omega}$. The structure of (52) follows by differentiating $\mathrm{svec}(\mathbf{U}^{T}\mathbf{U} - \mathbf{I})$ with respect to the $k$th column of $\mathbf{U}$ (remember that svec forms a vector of the upper triangular part of its matrix argument). Note that $\mathbf{F}_k \in \mathbb{R}^{k \times m}$ is full rank for any $k$, $1 \le k \le m$, (as the rows are orthogonal) and that $\mathbf{D}_r \in \mathbb{R}^{p_r \times q_r}$ is full rank as proven earlier. By considering the structure of $\mathbf{P}$ it follows that a linearly independent set of columns can be selected by choosing $p_r$ columns from the set of columns containing $\mathbf{D}_r$ and $k$ columns from each set containing $\mathbf{F}_k$ for $k = r + 1, \ldots, m$. This, as elaborated on earlier, is however equivalent to the statement that the set of solutions to (42) can locally around $\boldsymbol{\omega}$ be parameterized by specifying $q_r - p_r$ parameters from $\boldsymbol{\omega}_r$, $m - k$ parameters from $\mathbf{u}_k$ along with $\lambda_k$ and $z_k$ for $k = r + 1, \ldots, m$.

Now, turn attention to the original problem posed by the lemma, that is, the problem of obtaining a covering of $\mathcal{A}(\mathbf{a}, \mathbf{b})$ defined in (25) and where $\mathbf{a} = (a_1, \ldots, a_m)$, $\mathbf{b} = (b_1, \ldots, b_m)$ and $0 \le b_1 \le \ldots \le b_m$. Let $r$ be the maximum integer for which

$$0 = b_1 = \ldots = b_r < b_{r+1} \le \cdots \le b_m.$$

As stated earlier, if $b_1 > 0$ then $\mathcal{A}(\mathbf{a}, \mathbf{b})$ will be empty for sufficiently small $\epsilon$. It is thus safe to assume that $b_1 = 0$ and $r \ge 1$. Further, it can without loss of generality be assumed that $\epsilon$ is arbitrarily small. In particular, it can be assumed that

$$\epsilon^{1/2} < \kappa$$

where $\kappa$ is the constant introduced in (46).
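Consider the set

$$\mathcal{M}(\mathbf{b}) \triangleq \mathcal{M} \cap \{(\mathbf{U}, \boldsymbol{\lambda}, \mathbf{z}) \mid |\lambda_i| \le \epsilon^{b_i/2}\}.$$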

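The set $\mathcal{M}(\mathbf{b})$ is chosen such that any matrix $\mathbf{A} \in \mathcal{A}(\mathbf{a}, \mathbf{b})$ can be expressed as $\mathbf{A} = \mathbf{U}\boldsymbol{\Lambda}$ for some $(\mathbf{U}, \boldsymbol{\lambda}, \mathbf{z}) \in \mathcal{M}(\mathbf{b})$. Thus, the parametrization of $\mathcal{M}(\mathbf{b})$ will also provide a parametrization of $\mathcal{A}(\mathbf{a}, \mathbf{b})$. Let $\{\psi^{(l)}\}_{l=1}^{L}$ be a set of parameterizations (around rank $r$ points) such that

$$\mathcal{M}(\mathbf{b}) \subset \bigcup_{l=1}^{L} \mathcal{R}^{(l)} \qquad (53)$$

where $\mathcal{R}^{(l)} = \psi^{(l)}(\mathcal{D})$. The assumption relating $\epsilon$ to $\kappa$ ensures that it suffices to consider parameterizations around rank $r$ points, $\boldsymbol{\omega} \in \mathcal{M}$, in order to cover $\mathcal{M}(\mathbf{b})$. Note also that by the assumption in (46) the coordinate neighborhoods of $\psi^{(l)}$ are all equal to $\mathcal{D}$. Further, since $\mathcal{M}(\mathbf{b}) \subset \mathcal{M}$ is compact (and since $\mathcal{R}^{(l)}$ is open) it can be assumed that $L$ is finite [31]. Define $\mathcal{D}^{(l)}(\mathbf{b})$ according to

$$\mathcal{D}^{(l)}(\mathbf{b}) \triangleq \left(\psi^{(l)}\right)^{-1}\!\left(\mathcal{M}(\mathbf{b}) \cap \mathcal{R}^{(l)}\right)$$

and note that $\mathcal{D}^{(l)}(\mathbf{b}) \subset \mathcal{D}$. Finally, define

$$\mathcal{P}^{(l)}(\mathbf{b}) \triangleq \{\mathbf{A} \mid \exists \mathbf{z},\ (\mathbf{U}, \boldsymbol{\lambda}, \mathbf{z}) \in \mathcal{M}(\mathbf{b}) \cap \mathcal{R}^{(l)},\ \mathbf{A} = \mathbf{U}\boldsymbol{\Lambda}\}$$

where $\boldsymbol{\Lambda} = \mathrm{Diag}(\boldsymbol{\lambda})$ and note that

$$\mathcal{A}(\mathbf{a}, \mathbf{b}) \subset \bigcup_{l=1}^{L} \mathcal{P}^{(l)}(\mathbf{b}). \qquad (54)$$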

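So far, the existence of a specific parametrization, given by $\psi^{(l)}$, has been proven. However, not much has been said regarding the properties of this particular parametrization. Thus, to specify the benefits of the particular parametrization chosen, let the components of the parameter vector $\boldsymbol{\xi}$ obtained by selecting a subset of $(\mathbf{u}_1, \lambda_1, z_1, \ldots, \mathbf{u}_r, \lambda_r, z_r)$ be denoted by $\boldsymbol{\theta}_r \in \mathbb{R}^{q_r - p_r}$. Similarly, let the components obtained from $\mathbf{u}_k$, for $k = r + 1, \ldots, m$ be denoted by $\boldsymbol{\theta}_k \in \mathbb{R}^{m-k}$. That is,

$$\boldsymbol{\xi} = (\boldsymbol{\theta}_r, \boldsymbol{\theta}_{r+1}, \lambda_{r+1}, z_{r+1}, \ldots, \boldsymbol{\theta}_m, \lambda_m, z_m).$$

Further, introduce $\hat{\boldsymbol{\xi}}$ and $\tilde{\boldsymbol{\xi}}$ and partition these analogously. Assume that $\boldsymbol{\xi} \in \mathcal{D}^{(l)}(\mathbf{b})$, let $(\mathbf{U}, \boldsymbol{\lambda}, \mathbf{z}) = \psi^{(l)}(\boldsymbol{\xi})$ and $(\hat{\mathbf{U}}, \hat{\boldsymbol{\lambda}}, \hat{\mathbf{z}}) = \psi^{(l)}(\hat{\boldsymbol{\xi}})$ and let $\mathbf{A} = \mathbf{U}\boldsymbol{\Lambda}$ and $\hat{\mathbf{A}} = \hat{\mathbf{U}}\hat{\boldsymbol{\Lambda}}$ where $\hat{\boldsymbol{\Lambda}} = \mathrm{Diag}(\hat{\boldsymbol{\lambda}})$. Further, let $\tilde{\mathbf{A}} = \hat{\mathbf{A}} - \mathbf{A}$, i.e. $\tilde{\mathbf{A}}$ is the perturbation in $\mathbf{A}$ resulting from a perturbation, $\tilde{\boldsymbol{\xi}} = \hat{\boldsymbol{\xi}} - \boldsymbol{\xi}$, of $\boldsymbol{\xi}$. The objective is now to show that if $\tilde{\boldsymbol{\xi}} \in \mathcal{C}$ where

$$\mathcal{C} = \{\tilde{\boldsymbol{\xi}} \mid \|\tilde{\boldsymbol{\theta}}_r\|_\infty \le c\epsilon^{\frac{1}{2}},\ \|\tilde{\boldsymbol{\theta}}_k\|_\infty \le c\epsilon^{\frac{1-b_k}{2}},\ |\tilde{\lambda}_k| \le c\epsilon^{\frac{1}{2}},\ |\tilde{z}_k| \le c\epsilon^{\frac{1}{2}},\ k = r + 1, \ldots, m\}$$

and $c$ is some (yet to be defined) constant it will follow that

$$\|\hat{\mathbf{A}} - \mathbf{A}\| = \|\tilde{\mathbf{A}}\| \le \epsilon^{\frac{1}{2}}. \qquad (55)$$

In the above and in the following, $\lambda_k$, $\tilde{\lambda}_k$, $z_k$ and $\tilde{z}_k$ refer to the $k$th component of $\boldsymbol{\lambda}$, $\tilde{\boldsymbol{\lambda}}$, $\mathbf{z}$ and $\tilde{\mathbf{z}}$ respectively.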

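Let $\mathbf{u}_k$ and $\hat{\mathbf{u}}_k$ denote the $k$th columns of $\mathbf{U}$ and $\hat{\mathbf{U}}$. Let

$$(\tilde{\mathbf{U}}, \tilde{\boldsymbol{\lambda}}, \tilde{\mathbf{z}}) = (\hat{\mathbf{U}}, \hat{\boldsymbol{\lambda}}, \hat{\mathbf{z}}) - (\mathbf{U}, \boldsymbol{\lambda}, \mathbf{z})$$

and let $\tilde{\mathbf{u}}_k$ denote the $k$th column of $\tilde{\mathbf{U}}$. The first step is to prove that $\|\tilde{\mathbf{u}}_k\|_\infty \le cK_k\epsilon^{\frac{1-b_k}{2}}$ for some constant $K_k$. Note that since $b_1 \le \ldots \le b_m$ it follows immediately from the Lipschitz continuity of $\psi^{(l)}$ that $\|\tilde{\mathbf{u}}_m\| \le cK_m\epsilon^{\frac{1-b_m}{2}}$ for some constant $K_m$. This is since $\epsilon^{\frac{1-b_k}{2}} \le \epsilon^{\frac{1-b_m}{2}}$ for $k \le m$ implies that $\|\tilde{\boldsymbol{\xi}}\|_\infty \le c\epsilon^{\frac{1-b_m}{2}}$ and $K_m$ could simply be selected as the Lipschitz constant (in $\infty$-norm) of $\psi^{(l)}$.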
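For $k < m$, let $\mathbf{U}_k \in \mathbb{R}^{m \times k}$ be the matrix consisting of the first $k$ columns of $\mathbf{U}$, let $\boldsymbol{\lambda}_k \in \mathbb{R}^{k}$ be the vector of the first $k$ elements of $\boldsymbol{\lambda}$ and let $\mathbf{z}_k \in \mathbb{R}^{k}$ be the vector of the first $k$ elements of $\mathbf{z}$. Assume that $\|\tilde{\mathbf{u}}_i\| \le cK_i\epsilon^{\frac{1-b_i}{2}}$ for some constants $K_i$, $k < i \le m$, and note that $(\mathbf{U}_k, \boldsymbol{\lambda}_k, \mathbf{z}_k)$ must satisfy (48) for

$$\boldsymbol{\gamma} = \sum_{i=k+1}^{m} \lambda_i^2\,\mathrm{diag}(\mathbf{u}_i\mathbf{u}_i^{T}) - \mathbf{u}_i z_i$$

and

$$\eta = \sum_{i=k+1}^{m} [\ldots].$$

Note also that, by the structure of $\mathbf{P}$ in (51) it follows that

$$(\mathbf{U}_k, \boldsymbol{\lambda}_k, \mathbf{z}_k) = \psi_k(\boldsymbol{\theta}_r, \boldsymbol{\theta}_{r+1}, \lambda_{r+1}, z_{r+1}, \ldots, \boldsymbol{\theta}_k, \lambda_k, z_k, \boldsymbol{\gamma}, \eta) \qquad (56)$$

where $\psi_k$ is the function given by the implicit function theorem in (50). 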

m 

7 = E \^diag(uiUi^) - UiZi 

i=k+l 
m 

i=k+l 
- (Ui + Ui){Zi + Zi) 

and 

m m 

it is straightforward to show that 7 = 7 ~ 7 ^i^cl fj = t) — rj satisfies 

II7II00 < cKke^ and |??| < cKke^ 

for some constant Kk- In essence, the potentially large perturbation (on the order or e 2 ) in 0j for 
i, k < i < m is always multiplied by factors on the order of e~ which results in a perturbation, 7, 
on the order of ea. Note also that it is impUcitly assumed that e is such that cK^e^ < k or otherwise 
(a;,., 7,77) ^ Vr- However, as e can be assumed arbitrary small this is not a problem. 
By the Lipschitz continuity of ipk in dSOb . it follows that 

llufcll < cKke 2 

for some constant since the argument in i56l is bounded by 

max(ce 2 ^cKke^)<cKke 2 . 
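By induction it follows that $\|\tilde{\mathbf{u}}_k\| \le cK_k\epsilon^{\frac{1-b_k}{2}}$ for $k = r + 1, \ldots, m$ and $\|\tilde{\mathbf{u}}_k\| \le cK_r\epsilon^{\frac{1}{2}}$ for $k = 1, \ldots, r$ where $K_k$, $k = r, \ldots, m$, are constants independent of $\epsilon$ and $c$. Now, by expanding

$$\hat{\mathbf{A}} = \hat{\mathbf{U}}\hat{\boldsymbol{\Lambda}} = (\mathbf{U} + \tilde{\mathbf{U}})(\boldsymbol{\Lambda} + \tilde{\boldsymbol{\Lambda}}) = \mathbf{U}\boldsymbol{\Lambda} + \tilde{\mathbf{U}}\boldsymbol{\Lambda} + \mathbf{U}\tilde{\boldsymbol{\Lambda}} + \tilde{\mathbf{U}}\tilde{\boldsymbol{\Lambda}}$$

it follows that $\tilde{\mathbf{A}} = \hat{\mathbf{A}} - \mathbf{A}$ satisfies $\|\tilde{\mathbf{A}}\| \le cK\epsilon^{\frac{1}{2}}$ for some constant, $K$. Finally, by selecting $c$ according to $c = K^{-1}$ it follows that

$$\|\tilde{\mathbf{A}}\| = \|\hat{\mathbf{A}} - \mathbf{A}\| \le \epsilon^{\frac{1}{2}}.$$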
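The expansion of the perturbed product can be illustrated numerically. The sketch below is an example under stated assumptions rather than the authors' construction: it takes the column scalings $|\lambda_k| = \epsilon^{b_k/2}$, $\|\tilde{\mathbf{u}}_k\| = \epsilon^{(1-b_k)/2}$ and $|\tilde{\lambda}_k| = \epsilon^{1/2}$ with $0 \le b_k \le 1$ as given, and the function name `perturbation_of_A` and all numeric values are hypothetical. It checks that the difference of the two products equals the three cross terms and reports its Frobenius norm relative to $\epsilon^{1/2}$.

```python
import numpy as np


def perturbation_of_A(eps, b, rng):
    """Return (A_hat - A, expansion) for illustrative, assumed scalings."""
    b = np.asarray(b, dtype=float)
    m = b.size
    U, _ = np.linalg.qr(rng.standard_normal((m, m)))  # unit-norm columns
    lam = eps ** (b / 2)               # assumed |lambda_k| = eps^(b_k/2)

    dU = rng.standard_normal((m, m))
    dU /= np.linalg.norm(dU, axis=0)   # normalize each column
    dU *= eps ** ((1 - b) / 2)         # assumed ||u~_k|| = eps^((1-b_k)/2)
    dlam = np.full(m, np.sqrt(eps))    # assumed |lambda~_k| = eps^(1/2)

    A = U @ np.diag(lam)
    A_hat = (U + dU) @ np.diag(lam + dlam)
    # The three cross terms of (U + dU)(Lam + dLam) - U Lam.
    expansion = dU @ np.diag(lam) + U @ np.diag(dlam) + dU @ np.diag(dlam)
    return A_hat - A, expansion


rng = np.random.default_rng(1)
b = [0.0, 0.0, 0.5, 1.0]  # hypothetical exponents with 0 <= b_k <= 1
results = []
for eps in (1e-2, 1e-4, 1e-6):
    diff, expansion = perturbation_of_A(eps, b, rng)
    identity_ok = bool(np.allclose(diff, expansion, atol=1e-12))
    ratio = np.linalg.norm(diff) / np.sqrt(eps)  # Frobenius norm / eps^(1/2)
    results.append((eps, identity_ok, ratio))
    print(f"eps={eps:g}  identity holds: {identity_ok}  norm/eps^0.5 = {ratio:.3f}")
```

Under these assumed scalings each column of the perturbation has norm at most $3\epsilon^{1/2}$, so the printed ratio stays below $3\sqrt{m}$ as $\epsilon$ shrinks.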

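What has been shown so far is that a perturbation, $\tilde{\boldsymbol{\xi}}$, around a point, $\boldsymbol{\xi}$, in the parameter space $\mathcal{D}^{(l)}$ will, given that $\tilde{\boldsymbol{\xi}} \in \mathcal{C}$, result in a perturbation of $\mathbf{A}$, $\tilde{\mathbf{A}}$, which satisfies $\|\tilde{\mathbf{A}}\| \le \epsilon^{\frac{1}{2}}$. This implies that given a set of $\boldsymbol{\xi} \in \mathcal{D}^{(l)}(\mathbf{b})$, $\{\boldsymbol{\xi}^{(l,i)}\}_{i=1}^{I}$, for which

$$\mathcal{D}^{(l)}(\mathbf{b}) \subset \bigcup_{i=1}^{I} \mathcal{C}(\boldsymbol{\xi}^{(l,i)})$$

where

$$\mathcal{C}(\boldsymbol{\xi}) = \mathcal{C} + \boldsymbol{\xi}$$

we will also have a covering of $\mathcal{P}^{(l)}(\mathbf{b})$ given by

$$\mathcal{P}^{(l)}(\mathbf{b}) \subset \bigcup_{i=1}^{I} \mathcal{A}(\mathbf{A}^{(l,i)}) \qquad (57)$$

where $\mathbf{A}^{(l,i)} = \mathbf{U}^{(l,i)}\boldsymbol{\Lambda}^{(l,i)}$, $\boldsymbol{\Lambda}^{(l,i)} = \mathrm{Diag}(\boldsymbol{\lambda}^{(l,i)})$ and where $\mathcal{A}(\mathbf{A})$ is as defined earlier. However, as $\mathcal{C}(\boldsymbol{\xi})$ is simply a (rectangular) box centered at $\boldsymbol{\xi}$ and since

$$\mathcal{D}^{(l)}(\mathbf{b}) \subset \{\boldsymbol{\xi} \mid \|\boldsymbol{\theta}_r\|_\infty \le 2,\ \|\boldsymbol{\theta}_k\|_\infty \le 1,\ |\lambda_k| \le \epsilon^{\frac{b_k}{2}},\ |z_k| \le 2\epsilon^{\frac{b_k}{2}},\ k = r + 1, \ldots, m\} \qquad (58)$$

it follows that $\{\boldsymbol{\xi}^{(l,i)}\}_{i=1}^{I}$ could be chosen such that

$$I \mathrel{\dot{\le}} \epsilon^{-\mu}$$

where

$$\mu = \frac{q_r - p_r}{2} + \sum_{k=r+1}^{m} \left[\frac{(m-k)(1-b_k)^{+}}{2} + \frac{2(1-b_k)^{+}}{2}\right].$$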
This follows from the general statement that in order to cover a large $M$-dimensional box with side lengths $\epsilon^{\beta_i}$, $i = 1, \ldots, M$, with small boxes of side length $\epsilon^{\alpha_i}$, $i = 1, \ldots, M$, one needs (in the $\doteq$ sense)

$$\prod_{i=1}^{M} \epsilon^{-(\alpha_i - \beta_i)^{+}} = \epsilon^{-\sum_{i=1}^{M}(\alpha_i - \beta_i)^{+}}$$

small boxes in total. Note also that if $\alpha_i < \beta_i$ the "small" boxes are actually wider than the large box in the $i$th dimension which is the reason for the $(\alpha_i - \beta_i)^{+}$ expression as opposed to $(\alpha_i - \beta_i)$.
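The box-counting statement can be sanity-checked with a small grid computation. This is a minimal sketch under assumptions: the function names `box_cover_count` and `predicted_exponent`, the axis-aligned grid construction, and the exponent values are illustrative choices, not taken from the text.

```python
import math


def box_cover_count(eps, alpha, beta):
    """Count grid boxes with sides eps**alpha[i] needed to cover a box
    with sides eps**beta[i], one dimension at a time."""
    per_dim = []
    for a, b in zip(alpha, beta):
        ratio = eps ** (b - a)  # large side divided by small side
        # A single small box suffices when it is wider than the large box.
        per_dim.append(max(1, math.ceil(ratio)))
    return per_dim, math.prod(per_dim)


def predicted_exponent(alpha, beta):
    """Sum of (alpha_i - beta_i)^+ over all dimensions."""
    return sum(max(a - b, 0.0) for a, b in zip(alpha, beta))


alpha = [0.5, 0.5, 0.25]  # hypothetical small-box exponents
beta = [0.0, 1.0, 0.0]    # hypothetical large-box exponents
eps = 1e-8

per_dim, total = box_cover_count(eps, alpha, beta)
empirical = math.log(total) / math.log(1 / eps)

print("boxes per dimension:", per_dim)
print("empirical exponent:", round(empirical, 4))
print("predicted exponent:", predicted_exponent(alpha, beta))
```

In the second dimension the small box is wider than the large one, so that dimension contributes a single box, which is the effect captured by the positive part in the exponent.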
By noting that

$$q_r - p_r = (m + 2)r - m - 1 - \frac{r(r+1)}{2} = \sum_{k=2}^{r} (m - k + 2)$$

and using the assumption that $b_k = 0$ for $k = 1, \ldots, r$ it follows that $\mu$ can be written as

$$\mu = \sum_{k=2}^{m} \frac{(m - k + 2)(1 - b_k)^{+}}{2}.$$
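The simplification of $\mu$ can be verified by direct evaluation. The following sketch assumes, as the dimension counts in the text are read here, $q_r = (m+2)r$ unknowns and $p_r = m + 1 + r(r+1)/2$ equations; the function names and the test vector `b` are hypothetical.

```python
def q_minus_p(m, r):
    """q_r - p_r under the assumed counts q_r = (m+2)r, p_r = m+1+r(r+1)/2."""
    return (m + 2) * r - (m + 1 + r * (r + 1) // 2)


def pos(x):
    """Positive part (x)^+."""
    return max(x, 0.0)


def mu_two_part(m, r, b):
    """mu as (q_r - p_r)/2 plus the sum over k = r+1, ..., m (b is 1-indexed via b[k-1])."""
    tail = sum(((m - k) + 2) * pos(1 - b[k - 1]) / 2 for k in range(r + 1, m + 1))
    return q_minus_p(m, r) / 2 + tail


def mu_single_sum(m, b):
    """mu as a single sum over k = 2, ..., m."""
    return sum((m - k + 2) * pos(1 - b[k - 1]) / 2 for k in range(2, m + 1))


m, r = 4, 2
b = [0.0, 0.0, 0.5, 1.5]  # b_k = 0 for k <= r, nondecreasing afterwards

print("q_r - p_r:", q_minus_p(m, r))
print("sum_{k=2}^{r} (m-k+2):", sum(m - k + 2 for k in range(2, r + 1)))
print("mu (two-part form):", mu_two_part(m, r, b))
print("mu (single sum):", mu_single_sum(m, b))
```

Both forms agree because every term with $k \le r$ has $(1 - b_k)^{+} = 1$, so the first $r - 1$ terms of the single sum reproduce $(q_r - p_r)/2$.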

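Thus, it has so far been shown that it is possible to cover $\mathcal{P}^{(l)}(\mathbf{b})$ by sets $\mathcal{A}(\mathbf{A}_i)$. By (54) and since $L$ was finite this result extends to the covering of $\mathcal{A}(\mathbf{a}, \mathbf{b})$. That is, it has been shown that there exists a covering, $\mathfrak{A}$, which satisfies

$$\mathcal{A}(\mathbf{a}, \mathbf{b}) \subset \bigcup_{\mathbf{A}_i \in \mathfrak{A}} \mathcal{A}(\mathbf{A}_i)$$

and

$$|\mathfrak{A}| \mathrel{\dot{\le}} \epsilon^{-\mu}$$

as was asserted by the lemma.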
Fig. 1. The probability of error when $\mathbf{H} \in \mathbb{R}^{n \times m}$ has i.i.d. real valued Gaussian entries, and where $m = n = 4$.

Fig. 2. Illustration of the feasible set, $\mathcal{X}$, of the SDR detector in (5). The hyperplane $\mathcal{H}$ separates points in the feasible set that are close to and far from $\mathbf{X}_e$.

Fig. 3. The probability of error when $\mathbf{H} \in \mathbb{R}^{n \times m}$ has i.i.d. real valued Gaussian entries, and where $m = 4$ and $n = 3$.

Fig. 4. The probability of error when $\mathbf{H} \in \mathbb{R}^{n \times m}$ has i.i.d. real valued Gaussian entries, and where $m = 4$ and $n = 2$.

Fig. 5. The probability of error when $\mathbf{H} \in \mathbb{C}^{N \times M}$ has i.i.d. complex valued Gaussian entries, and where $N = M = 2$.




